Using The Sun C Compiler


This chapter describes how to compile C programs under the SunOS version of
the UNIX† operating system running on Sun Microsystems’ Sun-3 and Sun-4
(SPARC) workstations.

If you are already familiar with using cc (the UNIX C compiler), either on Sun
workstations or on other UNIX systems, you can probably ignore or skim the rest
of this chapter without regretting it later.

If you need to learn about programming in C, or about SunOS programming 
tools, you should refer to one or more of the introductory books available that 
address the topic. 


1.1. Basics — Compiling and Running C Programs

This section shows how to compile and run a minimal C program. Consider this
C program that just displays a message and exits:

    #include <stdio.h>

    main()
    {
        printf("Real Programmers Hack C!\n");
        exit(0);
    }


Using your preferred text editor, save the text of this program in a file called
hackers.c. After you have saved the file, compile it with the cc command:

    tutorial% cc hackers.c
    tutorial%

cc works silently unless there are errors in the program. In this case, there are
no errors, and cc compiles the program and saves an executable version of it in a
file named a.out.


† UNIX is a registered trademark of AT&T.
When you want to run the program, type the name of the executable file:

    tutorial% a.out
    Real Programmers Hack C!
    tutorial%

Note that the program’s last line was an exit() statement. If run interactively
from a shell, either with or without the final line, this program will behave as
expected:

    tutorial% a.out
    Real Programmers Hack C!
    tutorial%


However, if the same program (minus exit()) is executed in an environment
which examines the program’s exit status, unexpected results may occur. In
particular, if the program is executed from a Makefile, an unexpected error code
may be reported:

    tutorial% cat Makefile
    example: example.c
            cc example.c -o example
            example
    tutorial% make
    cc example.c -o example
    example
    main returns value which is NOT ignored
    *** Error code 40
    make: Fatal error: Command failed for target 'example'
    tutorial%


This strange message may be explained by noting that make examines the exit
status of each program that it invokes, where the program’s exit status is the
value returned by main() or passed to exit(). If main() does not return a
value or call exit(), the exit status is undefined and the program is in error.
This error may be detected by running System V lint on the suspect program:

    tutorial% /usr/5bin/lint example.c
    example.c
    (4) warning: main() returns random value to invocation environment

This program may be corrected by adding a return statement or a call to
exit().
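A minimal sketch of such a correction follows. It is written in ANSI C rather than the older style used in the listings of this chapter, and the helper name greet() is hypothetical. The point is only that every path hands back a defined status that main() can pass on.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: print the greeting and report success or failure. */
int greet(FILE *out)
{
    assert(out != NULL);
    if (fputs("Real Programmers Hack C!\n", out) == EOF)
        return EXIT_FAILURE;
    return EXIT_SUCCESS;    /* a defined status, never a leftover value */
}
```

Ending main() with `return greet(stdout);` or `exit(greet(stdout));` gives make, or any other invoking environment, a defined exit status to examine.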
More generally, if a function f() is declared with a result type, but ends without
returning a result, and the (undefined) result of f() is used in an expression
containing a call to f(), then the program is in error.

Some earlier versions of the compiler permitted programs that did not
incorporate either a terminating exit() call or a return statement.


1.2. C Compiler

This section describes the compiler options supported by Sun Microsystems’ C
compiler. Later sections cover specific dependencies and features of Sun C
under SunOS.

    cc [options] filename [libraries] ...

cc translates programs written in C into executable load modules (or into
relocatable binary programs for later linking with ld(1)), and optionally links (or
binds) the result with object files generated by cc or other language processors.

cc accepts a list of C source files and various object files contained in the list of
files specified by filename.... The resulting executable is placed in the file a.out,
unless the -o option is specified (see below).

cc lets you compile and link any combination of the following:

□ C source files (with a .c suffix)

□ C preprocessed source files with a .i suffix

□ SunOS system object-code files with .o suffixes

□ Assembler source files with .s suffixes

After successfully linking, cc places the product of linking those files in the file
a.out, or in the file specified by the -o option. Note that, unless otherwise
specified, options may follow the filename, as in cc file.c -o file.


1.3. cc Options

-a Option

This option directs cc to insert code to count how many times each basic block
in a program is executed. This creates a .d file for every .c file compiled that
accumulates execution data for its corresponding source file. On the Sun-3,
Sun-4, and SPARCstation, you can then run tcov(1) on the source files to
generate statistics about the program.

Since this option entails some optimization, it is incompatible with -g.

-align _block_ Option

This option directs cc to page-align the global uninitialized data
symbol block, which is equivalent to a FORTRAN common block. This
increases its size to a whole number of pages, and places its first byte at the
beginning of a page. Multiple -align options may be given.
-B binding Option

This option specifies whether bindings of libraries for linking are static or
dynamic, indicating whether libraries are non-shared or shared, respectively.

-c Option

This option directs cc to suppress linking with ld(1) and produce a .o file for
each source file. You can explicitly name a single object file with the -o option.

-C Option

This option prevents the C preprocessor, cpp(1), from removing comments.

-dalign Option

This option generates double load/store instructions whenever possible for
improved performance. It assumes that all double-typed data are double aligned,
and should not be used when correct alignment is not assured.

Sun-4 and SPARCstation only.

-dryrun Option

This option directs cc to show but not execute the commands constructed by the
compilation driver.

-Dname[=def] Option

This option defines a symbol name to the C preprocessor cpp(1). This is
equivalent to a #define directive at the beginning of the source. If you don’t
use =def, name is defined as ‘1’. Multiple -D options may be given.

-E Option

This option runs the source file through cpp only. It sends the output to either
stdout, or to a file named with the -o option (which must end with .i) and
includes the cpp line numbering information. (See also the -P option.)

Floating-Point Options

Sun supports several ways to perform floating-point calculations, both in
hardware and software. The floating-point options provided by cc permit
you to choose the way that gives you the best performance and portability for
your programs.

The following floating-point code generation option can be used on Sun-3, Sun-4
and SPARC systems:

-fsingle   This option directs cc to use single-precision arithmetic in
           computations involving only float expressions; that is, it does not
           convert everything to double, which is the default. Note that
           floating-point parameters are still converted to double precision,
           and functions returning values can still return double-precision
           values.

           Although this is not traditional C practice, some programs run
           much faster using this option. Be aware that some significance can
           be lost due to lower-precision intermediate values.

The floating-point code generation options usable on Sun-3s can be any of the
following:

-f68881    This option directs cc to generate in-line code for the Motorola
           MC68881 floating-point coprocessor (Sun-3 systems only).

-ffpa      This option directs cc to generate in-line code for the Sun
           Floating-Point Accelerator (Sun-3 systems only).

-fsoft     This option directs cc to generate software floating-point calls
           (this is the default for all Sun-3 workstations).

-fstore    This option ensures that expressions allocated to extended-precision
           registers are rounded to storage precision whenever an
           assignment occurs in the source code.

           Only effective if -f68881 is specified (Sun-3 systems only).

-fswitch   This directs cc to generate runtime-switched floating-point calls.
           The compiled object code is linked at runtime to routines that
           support one of the above types of floating-point code. This option is
           not recommended.

-g Option

This option produces additional symbol table information for dbx(1) and
dbxtool(1), and passes the -lg flag to ld(1) so as to include the g library,
/usr/lib/libg.a. This option suppresses the -O and -R options.

-go Option

This option produces additional symbol table information for adb(1). When this
option is given, the -O and -R options are suppressed.

-help Option

This option displays information about cc.

-Ipathname Option

This option adds pathname to the list of directories that are searched for
#include files with relative filenames (those not beginning with slash /).

The preprocessor first searches for #include files in the directory containing
sourcefile, then in directories named with -I options (if any), and finally, in
/usr/include. Programs that use system calls, for example, would need to
use the file types.h as one of their #include files; types.h contains
many type definitions used by common system calls.

-J Option

This option generates 32-bit offsets in switch() statement branches (Sun-3
systems only).

-l library Option

This option directs ld to link with object library library. The ordering of
libraries in the compile line is important, as symbols are resolved from left to
right.

Note: This option must follow the sourcefile arguments.

-L dir Option

This option adds dir to the list of directories containing object-library routines
(for linking with ld).

-M Option

This option runs only the macro preprocessor on the named C programs,
requesting that it generate makefile dependencies and send the result to the
standard output (see make(1) for details about makefiles and dependencies).
-misalign Option

This option generates code to allow loading and storage of misaligned data
(Sun-4 and SPARC systems only).

-o outputfile Option

This option names the output file outputfile. outputfile must have the appropriate
suffix for the type of file to be produced by the compilation. outputfile cannot be
the same as sourcefile since cc will not overwrite the source file.

-O[level] Option

This option directs cc to optimize the object code. It is ignored when either -g,
-go, or -a is used. level can be one of the following:

1   Do postpass assembly-level optimization only.

2   Do global optimization before code generation, including loop
    optimizations, common subexpression elimination, copy propagation, and
    automatic register allocation. -O2 does not optimize references to or
    definitions of external or indirect variables.

3   Same as -O2, but optimize uses and definitions of external variables. -O3
    does not trace the effects of pointer assignments. Neither -O3 nor -O4
    should be used when compiling either device drivers, or programs that
    modify external variables from within signal handlers.

4   Same as -O3, but traces the effects of pointer assignments.

Note: If you use -O without specifying the level, it is equivalent to using -O2.

-p Option

This option prepares the object code to collect data for profiling with prof(1).
-p invokes a run-time recording mechanism that produces a mon.out file at
normal termination.

-P Option

This option runs the source file through cpp(1), the C preprocessor, only. It then
puts the output in a file with a .i suffix. It does not include cpp-type line number
information in the output.

-pg Option

This option prepares the object code to collect data for profiling with gprof(1).
It invokes a run-time recording mechanism that produces a gmon.out file at
normal termination.

-pic Option

This option produces position-independent code. Each reference to a global
datum is generated as a dereference of a pointer in the global offset table. Each
function call is generated in pc-relative addressing mode through a procedure
linkage table. The size of the global offset table is limited to 64K on Sun-3
systems, or to 8K on SPARCstations.

-PIC Option

This option is similar to -pic, but lets the global offset table span the range of
32-bit addresses in those rare cases where there are too many global data objects
for -pic.
-pipe Option

This option directs cc to use pipes, rather than intermediate files, between
compilation stages. (Very CPU-intensive.)

-Qoption prog opt Option

This option passes the option opt to the compiler phase prog. The option must
be appropriate to that program and may begin with a minus sign. prog can be
one of: as(1), cpp(1), inline, or ld(1).

-Qpath pathname Option

This option inserts a directory pathname into the search path used to locate
compiler components. This path will also be searched first for certain relocatable
object files that are implicitly referenced by the compiler driver, for example
*crt*.o and bb_link.o. This lets you choose whether or not to use default
versions of programs invoked during compilation.

-Qproduce sourcetype Option

This option causes cc to produce source code of the type sourcetype. sourcetype
can be one of the following:

.c   C source (from bb_count).

.i   Preprocessed C source from cpp.

.o   Object file from as.

.s   Assembler source (from ccom, inline(1), or c2).

-R Option

This option directs cc to merge the data segment with the text segment for
as(1). Data initialized in the object file produced by this compilation is
read-only, and (unless linked with ld -N) is shared between processes. This option
is ignored when either -g or -go is used.

-S Option

This option directs cc to produce an assembly source file but not to assemble the
program.

-sb Option

This option generates extra symbol table information for the Sun Source Code
Browser. This is an unbundled product that will be released based on 4.1.

target arch Option

This option compiles object files for the specified processor architecture. Unless
used in conjunction with one of the Sun Cross-Compilers, correct programs can
be generated only for the architecture of the host on which the compilation is
performed. target arch can be one of:

-sun2   Produce object files for a Sun-2 system.

-sun3   Produce object files for a Sun-3 system.

-sun4   Produce object files for Sun-4 and SPARC systems.

-temp=dir Option

This option sets the directory to contain temporary files generated during the
compilation process to be dir.
-time Option

This option directs cc to report execution times for the various compilation
passes.

-U name Option

This option removes any initial definition of the cpp symbol name. This option
is the inverse of the -D option. Multiple -U options may be given.

-w Option

This option directs cc to not print warnings.


1.4. Environment

FLOAT_OPTION

(Sun-3, Sun-4, and SPARC systems only.) When no floating-point option is
specified, the compiler uses the value of this environment variable (if set).
Recognized values are: f68881, ffpa, fsky, fswitch, and fsoft.
Accessing a Program’s Environment


This chapter discusses two basic topics: 

□ How to get the arguments from the command line used to run a program. 

□ How to access environment variables. 

2.1. Basics — Accessing Command Line Arguments

Assuming that you have written a C program, you might like to be able to get
information from the command line when the user starts up the program.
Although many SunOS system programs are run as filters (they obtain input
from the standard input and send output to the standard output), sometimes you
might like to be able to specify alternative files to operate upon, or to specify
options on the command line to control the program’s behavior.

When a C program is run as a command, the arguments on the command line are
made available to the program’s main() function as its first two arguments, an
argument count argc and an array argv of pointers to character strings that
contain the arguments. By convention, argv[0] is the command name itself,
so argc is always greater than 0. Use argc to bound any traversal of argv.
(Under ANSI C, argv[argc] is also guaranteed to be a null pointer, but older
environments may not provide it.)

The following program illustrates the mechanism: it simply echoes its arguments
back to the terminal; this is essentially the echo command.


    #include <stdio.h>

    main(argc, argv)    /* echo arguments */
    int argc;
    char *argv[];
    {
        int arg_count;

        for (arg_count = 1; arg_count < argc; arg_count++)
            printf("%s%c", argv[arg_count], (arg_count < argc-1) ? ' ' : '\n');
        exit(0);
    }



argv is a pointer to an array whose elements are pointers to arrays of characters;
each is terminated by \0, so they can be treated as strings. The program starts by
printing argv[1] and loops until it has printed argv[argc-1].
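The same traversal can collect the arguments instead of printing them. The following is a sketch only, in ANSI C, and the helper name join_args() is hypothetical. It walks argv[1] through argv[argc-1] exactly as the echo program does, but copies each string into a caller-supplied buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy argv[1] .. argv[argc-1] into buf, separated by
 * single spaces. Returns 0 on success, -1 if buf is too small.
 */
int join_args(int argc, char *argv[], char *buf, size_t size)
{
    size_t used = 0;
    int i;

    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    for (i = 1; i < argc; i++) {
        size_t len = strlen(argv[i]);
        size_t sep = (i > 1) ? 1 : 0;   /* a space before all but the first */

        if (used + sep + len + 1 > size)
            return -1;
        if (sep)
            buf[used++] = ' ';
        memcpy(buf + used, argv[i], len + 1);   /* copies the \0 as well */
        used += len;
    }
    return 0;
}
```

Called from main() as `join_args(argc, argv, line, sizeof line)`, it leaves in line the text that the echo program would have printed, without the trailing newline.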
The argument count and the arguments are parameters to main, so if you want to
keep them around for other routines to use, you must copy them to external
variables.


2.2. Basics — Accessing Environment Variables

The next topic is how to obtain values from a running program’s environment.
You can ‘tailor’ your SunOS system environment by setting environment
variables, and these environment variables are accessible from a program.

When a C program is started, three arguments are passed to its main function.
In addition to argc and argv as described above, there is an array (named
envp) of pointers to the character strings that comprise the environment.

Each environment variable is a null-terminated character string of the form
name=value that can be manipulated like any other character string. (envp itself is
also null-terminated.)

Here is a short program to display all the environment variables: 


    #include <stdio.h>

    main(argc, argv, envp)
    int argc;
    char *argv[];
    char *envp[];
    {
        int env_count = 0;

        while (envp[env_count] != NULL) {
            printf("%s\n", envp[env_count]);
            env_count++;
        }
        exit(0);
    }


If you save the above text as environ.c, you can compile and run it as follows:

    tutorial% cc environ.c
    tutorial% a.out
    HOME=/usr/henry
    SHELL=/bin/csh
    PATH=/usr/doctools/bin:/usr/local:.:/usr/ucb:/bin:/usr/bin
    TERM=sun
    USER=henry
    EXINIT=set noai wrapmargin=16 para=IPLPPPQPLSLEDSDETSTEKSKEPSPEEQENLIpplpipbp
    WINDOW_PARENT=/dev/win0
    WINDOW_ME=/dev/win8
    WINDOW_GFX=/dev/win8
    tutorial%
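Each line printed above is one name=value string. Picking such an entry apart by hand might look like the following sketch, written in ANSI C. The helper names match_entry() and lookup_env() are hypothetical; they only illustrate splitting at the first '=' and comparing the name part.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if entry has the form "name=value" and its name part
 * equals name, return a pointer to the value part; otherwise return NULL.
 */
const char *match_entry(const char *entry, const char *name)
{
    const char *eq;
    size_t len;

    assert(entry != NULL && name != NULL);
    eq = strchr(entry, '=');
    if (eq == NULL)
        return NULL;                    /* malformed entry: no '=' */
    len = (size_t)(eq - entry);
    if (strlen(name) != len || strncmp(entry, name, len) != 0)
        return NULL;
    return eq + 1;
}

/* Scan a null-terminated environment array such as envp. */
const char *lookup_env(char *const envp[], const char *name)
{
    int i;

    for (i = 0; envp[i] != NULL; i++) {
        const char *value = match_entry(envp[i], name);
        if (value != NULL)
            return value;
    }
    return NULL;
}
```

With the environment shown above, `lookup_env(envp, "TERM")` would point at the string sun, while a name that is only a prefix of a variable, such as TER, finds nothing.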

Accessing Environment Variables Using getenv()

While environ.c is somewhat useful, parsing the name=value pairs is rather
tedious, so there is a C library function called getenv() whose purpose is to
get values from the environment. Here is the interface definition for getenv():

    char *getenv(name)
    char *name;

Now we can compose a program that displays the value of a variable supplied as
an argument on its command line:

    /* getenv0.c - obtain specified variable from environment */

    #include <stdio.h>

    main(argc, argv)
    int argc;
    char *argv[];
    {
        char *getenv();
        char *variable;

        /* Check any argument supplied */
        if (argc < 2) {
            printf("Usage: %s name\n", argv[0]);
            exit(1);
        }

        /* Search for the variable: */
        if ((variable = getenv(argv[1])) == NULL)
            printf("%s: no variable %s\n", argv[0], argv[1]);
        else
            printf("%s = %s\n", argv[1], variable);
        exit(0);
    }

After compiling this program, you can use it like this:

    tutorial% a.out PATH
    PATH = /usr/doctools/bin:/usr/local:.:/usr/ucb:/bin:/usr/bin
    tutorial% a.out nonesuch
    a.out: no variable nonesuch
    tutorial% a.out
    Usage: a.out name
    tutorial%
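Because getenv() returns NULL for a variable that is not set, callers often want a default value instead. The sketch below, in ANSI C, wraps that check; the helper name getenv_or() is hypothetical, and treating an empty value like an unset one is a design choice of this sketch, not something the library requires.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: value of name, or fallback if it is unset or empty. */
const char *getenv_or(const char *name, const char *fallback)
{
    const char *value;

    assert(name != NULL);
    value = getenv(name);
    return (value != NULL && value[0] != '\0') ? value : fallback;
}
```

A program might, for example, pick its terminal type with `getenv_or("TERM", "dumb")`, where "dumb" is simply an illustrative default.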
Processes


The following section describes how to execute one program from within
another. This makes it possible to use existing programs rather than always
having to write new ones.

3.1. The system() Function

The easiest way to execute a program from another is to use the standard library
routine system(). system() takes one argument, a command string exactly
as typed at the terminal (except for the newline at the end), executes it, and
returns a status word. This is useful, for instance, to timestamp the output of a
program.

The in-memory formatting capabilities of sprintf() are useful if you must
build the command string from pieces.

3.2. Low-Level Process Creation — execl() and execv()

If you’re not using the standard library, or if you need finer control over what
happens, you will have to construct calls to other programs using the more
primitive routines that the standard library’s system() routine is based on [1].

The most basic operation is to execute another program without returning, by
using the routine execl(). For example, you can display the date as the last
action of a running program:

    execl("/bin/date", "date", NULL);

[1] system() uses /bin/sh (the Bourne Shell) to execute the command string, so
syntax specific to the C-Shell will not work.



Revision A of 27 March, 1990 





C Programmer’s Guide


The arguments that you pass to execl() are:

1. The filename of the command that you want executed; you have to know
   where it is found in the file system.

2. The second argument is conventionally the program name (that is, the last
   component of the file name), but this is seldom used except as a placeholder.

3. If the command takes arguments, they are strung out in order, as a
   comma-separated list, after the program name (or its position).

4. Following the arguments, the end of the list is marked by a NULL argument.

The execl() call overlays the existing program with the new one, runs that,
then exits. There is no return to the original program.

More commonly, a program falls into two or more phases that communicate only
through temporary files. Here it is natural to start the second pass simply by an
execl() call from the first.

The one exception to the rule that the original program never gets control back
occurs when there is an error in performing the execl() call itself, for example
if the file can’t be found or is not executable. If you don’t know where date
is located, you might try

    execl("/bin/date", "date", NULL);
    execl("/usr/bin/date", "date", NULL);
    fprintf(stderr, "Someone stole 'date'\n");



A variant of execl() called execv() is useful when you don’t know in
advance how many arguments there are going to be. The call is

    execv(filename, argp);

where argp is an array of pointers to the arguments; the last pointer in the array
must be NULL so execv() can tell where the list ends. As with execl(),
filename is the file in which the program is found, and argp[0] is the name
of the program. (This arrangement is identical to the argv array for program
arguments.)
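When the number of arguments is only known at run time, the vector has to be assembled before the call. The following ANSI C sketch shows one way to do that; the helper name make_argv() is hypothetical. It places the program name in slot 0 and the terminating NULL after the last argument, as execv() requires.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: build an argument vector for execv() from a program
 * name and n arguments. The caller frees the returned array (not the strings).
 */
char **make_argv(char *name, char *args[], int n)
{
    char **argp;
    int i;

    assert(name != NULL && n >= 0);
    argp = malloc((size_t)(n + 2) * sizeof *argp);  /* name + n args + NULL */
    if (argp == NULL)
        return NULL;
    argp[0] = name;
    for (i = 0; i < n; i++)
        argp[i + 1] = args[i];
    argp[n + 1] = NULL;                 /* marks the end of the list */
    return argp;
}
```

The result can be handed directly to execv(), for example `execv("/bin/date", make_argv("date", args, n))`, after checking that it is not NULL.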

Neither of these routines provides the niceties of normal command execution.
There is no automatic search of multiple directories; you have to know
precisely where the command is located. Nor do you get the expansion of
metacharacters like <, >, *, ?, and [] in the argument list. If you want these, use
execl() to invoke a shell, sh(1), which then does all the work. Construct a
string commandline that contains the complete command as it would have
been typed at the terminal, then call

    execl("/bin/sh", "sh", "-c", commandline, NULL);

The shell is assumed to be at a fixed place, /bin/sh. Its argument -c says to
treat the next argument as a whole command line, so it does just what you want.
The only problem is in constructing the right information in commandline.
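One way to assemble commandline is formatted output into a buffer. The sketch below is an assumption-laden illustration in ANSI C: it uses snprintf(), the bounds-checked relative of sprintf() from later C standards, and the helper name build_command() is hypothetical. It attempts no shell quoting, so it is only safe for arguments without spaces or metacharacters.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "prog file" into buf. Returns 0 on success,
 * -1 if the result did not fit. No shell quoting is attempted.
 */
int build_command(char *buf, size_t size, const char *prog, const char *file)
{
    int n;

    assert(buf != NULL && size > 0);
    n = snprintf(buf, size, "%s %s", prog, file);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

After a successful call, buf can be passed as the commandline argument of the execl() call shown above.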

3.3. Process Control — fork() and wait()

So far what we’ve talked about isn’t really all that useful by itself. Next we show
how to regain control after running a program with execl() or execv().
Since these routines simply overlay the new program on the old one, to save the
old one requires that it first be split into two copies; one of these can be overlaid,
while the other waits for the new, overlaying program to finish. The splitting is
done by a routine called fork():

    proc_id = fork();

This call splits the program into two copies, both of which continue to run. The
only difference between the two is the value of proc_id, the process id. In one
of these processes (the child), proc_id is zero. In the other (the parent),
proc_id is nonzero; it is the process number of the child. Thus the basic way
to call, and return from, another program is

    if (fork() == 0)
        execl("/bin/sh", "sh", "-c", cmd, NULL);    /* in child */

And in fact, except for handling errors, this is sufficient. The fork() makes
two copies of the program. In the child, the value returned by fork() is zero,
so it calls execl() which does the command and then dies. In the parent,
fork() returns nonzero, so it skips the execl(). If there is an error, fork()
returns -1.

More often, the parent wants to wait for the child to terminate before continuing
itself. This can be done with the function wait():

    int status;

    if (fork() == 0)
        execl(...);
    wait(&status);

This still doesn’t handle any abnormal conditions, such as a failure of the
execl() or fork(), or the possibility that there might be more than one child
running simultaneously. The wait() returns the process id of the terminated
child, in case you want to check it against the value returned by fork().
Finally, this fragment doesn’t deal with any unusual behavior on the part of the
child (which is reported in status). Still, these three lines are the heart of the
standard library’s system() routine, which we’ll show in a moment.

The status returned by wait() encodes in its low-order eight bits the
system’s idea of the child’s termination status; it is 0 for normal termination and
nonzero to indicate various kinds of problems. The next higher eight bits are
taken from the argument of the call to exit() which caused a normal
termination of the child process. It is good coding practice for all programs to return
meaningful exit status. (A program that does not explicitly call exit() does
not automatically return a 0 status.)
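Putting the pieces together, the following ANSI C sketch runs a command through the shell, handles a failed fork() or execl(), and decodes the status. It is an illustration under stated assumptions, not the library's implementation: the helper name run_command() is hypothetical, and it uses the POSIX waitpid() call and the WIFEXITED and WEXITSTATUS macros in place of wait() and manual bit manipulation.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run cmd through /bin/sh and return the child's exit
 * status (0-255), or -1 if it could not be started or ended abnormally.
 */
int run_command(const char *cmd)
{
    pid_t pid;
    int status;

    assert(cmd != NULL);
    pid = fork();
    if (pid == -1)
        return -1;                      /* fork() failed */
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                     /* reached only if execl() failed */
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFEXITED(status))
        return -1;                      /* killed by a signal, for example */
    return WEXITSTATUS(status);         /* the argument the child gave exit() */
}
```

Waiting for one specific process id also sidesteps the case of several children running at once, since the status of an unrelated child is never picked up by mistake.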

When a program is called by the shell, the three file descriptors 0, 1, and 2 are set
up to point to the right files (see Appendix A.1), and all other possible file
descriptors are available for use. When this program calls another one, correct
etiquette suggests making sure the same conditions hold. Neither fork() nor
exec affects open files in any way. If the parent is buffering output that must
come out before output from the child, the parent must flush its buffers before the
execl(). Conversely, if a caller buffers an input stream, the called program
will lose any information that has been read by the caller.

3.4. Pipes

A pipe is an I/O channel intended for use between two cooperating processes:
one process writes into the pipe, while the other process reads from the pipe. The
system looks after buffering the data and synchronizing the two processes. Most
pipes are created by the shell, as in

    tutorial% ls | pr

which connects the standard output of ls to the standard input of pr.
Sometimes, however, it is most convenient for a process to set up its own plumbing; in
this section, we illustrate how the pipe connection is established and used.

The system call pipe() creates a pipe. Since a pipe is used for both reading
and writing, two file descriptors are returned; the actual usage is like this:

    int fd[2];

    stat = pipe(fd);
    if (stat == -1)
        /* there was an error ... */

fd is an array of two file descriptors, where fd[0] is the read side of the pipe
and fd[1] is for writing. These may be used in read(), write() and
close() calls just like any other file descriptors.

If a process reads a pipe which is empty, it waits until data arrives; if a process
writes into a pipe which is full, it waits until the pipe empties somewhat. If the
write side of the pipe is closed, a subsequent read will encounter end of file.
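These rules can be observed within a single process. The ANSI C sketch below writes a short message into a pipe, closes the write side, and reads until end of file; the helper name pipe_roundtrip() is hypothetical. The message must be small enough to fit in the pipe's buffer, because nothing is reading while the write takes place.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write msg into a fresh pipe, close the write side,
 * and read everything back into buf. Returns the number of bytes read, or
 * -1 on error. msg must fit in the pipe's buffer or write() would block.
 */
long pipe_roundtrip(const char *msg, char *buf, size_t size)
{
    int fd[2];
    size_t total = 0;
    ssize_t n = 0;

    assert(msg != NULL && buf != NULL);
    if (pipe(fd) == -1)
        return -1;
    if (write(fd[1], msg, strlen(msg)) == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);                       /* no writers left: reader sees EOF */
    while (total < size && (n = read(fd[0], buf + total, size - total)) > 0)
        total += (size_t)n;
    close(fd[0]);
    return (n < 0) ? -1 : (long)total;
}
```

If the close(fd[1]) call were omitted, the final read() would wait forever, since the process itself would still count as a potential writer.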
To illustrate the use of pipes in a realistic setting, let us write a function called
popen(cmd, mode), which creates a process cmd (just as system() does),
and returns a file descriptor that will either read from or write to that process,
according to mode. That is, the call

    fout = popen("pr", WRITE);

creates a process that executes the pr command; subsequent write() calls
using the file descriptor fout will send their data to that process through the
pipe.

popen() first creates the pipe with a pipe() system call; it then fork()’s to
create two copies of itself. The child decides whether it is supposed to read or
write, closes the other side of the pipe, then calls the shell (via execl()) to run
the desired process. The parent likewise closes the end of the pipe it does not
use. These closes are necessary to make end-of-file tests work properly. For
example, if a child that intends to read fails to close the write end of the pipe, it
will never see the end of the pipe file, just because there is one writer potentially
active.

    #include <stdio.h>

    #define READ  0
    #define WRITE 1
    #define tst(a, b)  (mode == READ ? (b) : (a))

    static int popen_pid;

    popen(cmd, mode)
    char *cmd;
    int mode;
    {
        int p[2];

        if (pipe(p) < 0)
            return (NULL);
        if ((popen_pid = fork()) == 0) {
            close(tst(p[WRITE], p[READ]));
            close(tst(0, 1));
            dup(tst(p[READ], p[WRITE]));
            close(tst(p[READ], p[WRITE]));
            execl("/bin/sh", "sh", "-c", cmd, 0);
            _exit(1);    /* disaster has occurred if we get here */
        }
        if (popen_pid == -1)
            return (NULL);
        close(tst(p[READ], p[WRITE]));
        return (tst(p[WRITE], p[READ]));
    }


The sequence of close ( ) ’s in the child is a bit tricky. Suppose that the task is 
to create a child process that will read data from the parent. Then the first 
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close ( ) closes the write side of the pipe, leaving the read side open. The lines 


f — — 

A 

close (tst (0, 1) ) ; 


dup (tst (p [READ] , p [WRITE] ) ) ; 


V 



are the conventional way to associate the pipe descriptor with the standard input 
of the child. The close ( ) closes file descriptor 0, that is, the standard input, 
dup () is a system call that returns a duplicate of an already open file descriptor. 
File descriptors are assigned in increasing order and the first available one is 
returned, so the effect of the dup ( ) is to copy the file descriptor for the pipe 
(read side) to file descriptor 0; thus the read side of the pipe becomes the standard 
input 2 . Finally, the old read side of the pipe is closed. 

A similar sequence of operations takes place when the child process is supposed 
to write to the parent instead of reading. You may find it a useful exercise to step 
through that case. 
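
The descriptor shuffle can also be tried in isolation, without forking. The following is a minimal sketch rather than part of popen(): the helper names redirect_fd() and restore_fd() are hypothetical, and it assumes dup2() is available, which performs the close-then-dup step in a single call.

```c
#include <assert.h>
#include <unistd.h>

/* Make descriptor `target` refer to the same file as `source`.
 * Returns a saved copy of the old `target`, or -1 on failure. */
int redirect_fd(int source, int target)
{
    int saved;

    assert(source >= 0 && target >= 0);
    saved = dup(target);            /* keep the old target around */
    if (saved < 0)
        return -1;
    if (dup2(source, target) < 0) { /* close target, then copy source onto it */
        close(saved);
        return -1;
    }
    return saved;
}

/* Undo redirect_fd(): put the saved descriptor back on `target`. */
int restore_fd(int saved, int target)
{
    int rc = dup2(saved, target);

    close(saved);
    return rc < 0 ? -1 : 0;
}
```

A child that is meant to write to its parent would call something like redirect_fd(p[WRITE], 1) before the execl(), so that everything the new program writes on its standard output goes into the pipe.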

The job is not quite done, for we still need a function pclose() to close the
pipe created by popen(). The main reason for using a separate function rather
than close() is that it is desirable to wait for the termination of the child
process. First, the return value from pclose() indicates whether the process
succeeded. Equally important when a process creates several children is that only a
bounded number of unwaited-for children can exist, even if some of them have
terminated; performing the wait() lays the child to rest. Thus:


    #include <signal.h>

    pclose(fd)      /* close pipe fd */
    int fd;
    {
        register r, (*hstat)(), (*istat)(), (*qstat)();
        int status;
        extern int popen_pid;

        close(fd);
        istat = signal(SIGINT, SIG_IGN);
        qstat = signal(SIGQUIT, SIG_IGN);
        hstat = signal(SIGHUP, SIG_IGN);
        while ((r = wait(&status)) != popen_pid && r != -1)
            ;
        if (r == -1)
            status = -1;
        signal(SIGINT, istat);
        signal(SIGQUIT, qstat);
        signal(SIGHUP, hstat);
        return (status);
    }


2 Yes, this is a bit tricky, but it’s a standard idiom. 
The calls to signal() make sure that no interrupts, etc., interfere with the
waiting process; this is the topic of the next section.

The routine as written has the limitation that only one pipe may be open at a
time, because of the single shared variable popen_pid; it really should be an
array indexed by file descriptor. A popen() function, with slightly different
arguments and return value, is available as part of the standard I/O library
discussed later. As currently written, it shares the same limitation.
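
One way the suggested array might look is sketched below. It is an illustration only: the names remember_child() and forget_child() and the limit MAX_PIPES are assumptions, not part of any library.

```c
#include <assert.h>
#include <sys/types.h>

#define MAX_PIPES 64                /* assumed limit, chosen for illustration */

static pid_t pipe_pids[MAX_PIPES];  /* 0 means "no child recorded" */

/* Record that descriptor fd is the pipe connected to child pid. */
int remember_child(int fd, pid_t pid)
{
    assert(pid > 0);
    if (fd < 0 || fd >= MAX_PIPES)
        return -1;
    pipe_pids[fd] = pid;
    return 0;
}

/* Look up and clear the child recorded for fd; returns -1 if there is none. */
pid_t forget_child(int fd)
{
    pid_t pid;

    if (fd < 0 || fd >= MAX_PIPES || pipe_pids[fd] == 0)
        return -1;
    pid = pipe_pids[fd];
    pipe_pids[fd] = 0;
    return pid;
}
```

With such a table, popen() would call remember_child() with the descriptor it returns, and pclose() would call forget_child(fd) to learn which process to wait for, so several pipes could be open at once.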
Signals — Interrupts and All That


This chapter is concerned with how to deal gracefully with signals from the out- 
side world (like interrupts) and with program faults. Since there’s nothing very 
useful that can be done from within a C program about program faults, which 
arise mainly from illegal memory references or from execution of peculiar 
instructions, we’ll discuss only the outside world signals: interrupt and quit, 
which are generated from the keyboard, hangup, caused by hanging up the phone 
on dialup lines, and terminate, generated by the kill command. When one of 
these events occurs, the signal is sent to all processes which were started by the 
corresponding user — the signal terminates the process unless other arrange- 
ments have been made. In the quit case, a core image file is written for debug- 
ging purposes. 

signal() is the routine which alters the default action. signal() has two
arguments: the first specifies the signal to be processed, and the second argument
specifies what to do with that signal. The first argument is just a numeric code,
but the second is either a function, or a somewhat strange code that requests that
the signal either be ignored or that it be given the default action. The include file
signal.h gives names for the various arguments, and should always be
included when signals are used. Thus


    #include <signal.h>

    signal(SIGINT, SIG_IGN);

means that interrupts are to be ignored, while

    signal(SIGINT, SIG_DFL);


restores the default action of process termination. In all cases, signal()
returns the previous action set for the signal. The second argument to signal()
may instead be the name of a function (which must be declared explicitly if the
compiler hasn’t seen it already). In this case, the named routine is called when
the signal occurs. Most commonly this facility is used so that the program can
clean up unfinished business before terminating, for example to delete a
temporary file:
    #include <stdio.h>
    #include <signal.h>

    char tempfile[] = "temp.XXXXXX";    /* name of the temporary file */

    main()
    {
        int onintr();

        if (signal(SIGINT, SIG_IGN) != SIG_IGN)
            signal(SIGINT, onintr);
        mktemp(tempfile);

        /* Process . . . */

        exit(0);
    }

    onintr()        /* clean up if interrupted */
    {
        unlink(tempfile);
        exit(1);
    }

Why the test and the double call to signal()? Recall that signals such as
interrupt are sent to all processes started by a particular user. Accordingly, when
a program is to be run non-interactively (started with &), the shell turns off
interrupts for it so it won’t be stopped by interrupts intended for foreground
processes. If this program began by announcing that all interrupts were to be
sent to the onintr() routine regardless, that would undo the shell’s effort to
protect it when run in the background.

The solution, shown above, is to test the state of interrupt handling, and to
continue to ignore interrupts if they are already being ignored. The code as written
depends on the fact that signal() returns the previous state of a particular
signal. If signals were already being ignored, the process should continue to ignore
them; otherwise, they should be caught.
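
The test can be wrapped in a small helper so that it is written only once. This is a sketch in present-day C, not the manual's own code: the names catch_unless_ignored() and on_interrupt() and the handler_t typedef are hypothetical.

```c
#include <assert.h>
#include <signal.h>

typedef void (*handler_t)(int);

/* Install `handler` for `sig` unless the signal was already being ignored.
 * Returns 1 if the handler was installed, 0 if the signal stays ignored,
 * and -1 on error. */
int catch_unless_ignored(int sig, handler_t handler)
{
    handler_t old;

    assert(handler != SIG_IGN);
    old = signal(sig, SIG_IGN);     /* peek at the previous disposition */
    if (old == SIG_ERR)
        return -1;
    if (old == SIG_IGN)
        return 0;                   /* respect the shell's choice */
    if (signal(sig, handler) == SIG_ERR)
        return -1;
    return 1;
}

/* Placeholder handler used only for illustration. */
void on_interrupt(int sig)
{
    (void)sig;
}
```

A program would call catch_unless_ignored(SIGINT, on_interrupt) once at startup; run in the background, it would see a return value of 0 and leave interrupts ignored.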

A more sophisticated program may wish to intercept an interrupt and interpret it 
as a request to stop what it is doing and return to its own command processing 
loop. Think of a text editor — interrupting a long display should not terminate 
the edit session and lose the work already done. The outline of the code for this 
case may be written like this: 
    #include <stdio.h>
    #include <signal.h>
    #include <setjmp.h>

    jmp_buf sjbuf;

    onintr()
    {
        printf("\nInterrupt\n");
        longjmp(sjbuf, 1);      /* return to saved state */
    }

    main()
    {
        int (*istat)(), onintr();

        istat = signal(SIGINT, SIG_IGN);    /* save old status */
        setjmp(sjbuf);                      /* save current stack position */
        if (istat != SIG_IGN)
            signal(SIGINT, onintr);

        /* main processing loop */
    }


The include file setjmp.h declares the type jmp_buf, an object in which a
process’s state can be saved; sjbuf is such an object. The setjmp() routine
then saves the state. When an interrupt occurs the onintr() routine is called,
which can display a message, set flags, or whatever. longjmp() takes as
argument an object set by setjmp(), and restores control to the location following
the call to setjmp(), so control (and the stack level) will pop back to the
place in the main routine where the signal is set up and the main loop entered.
Notice, by the way, that the signal gets set again after an interrupt occurs.

Some programs that want to detect signals simply can’t be stopped at an arbitrary
point, for example in the middle of updating a linked list. If the routine called
when a signal occurs sets a flag and then returns instead of calling exit() or
longjmp(), execution continues at the exact point it was interrupted. The
interrupt flag can then be tested later.
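
A flag-setting handler might be sketched as follows. The function names are hypothetical, and the use of the type sig_atomic_t for the flag is an assumption taken from later C standards rather than from this text.

```c
#include <signal.h>

static volatile sig_atomic_t intflag = 0;   /* set by the handler, tested later */

/* Handler that only records the interrupt and returns. */
void note_interrupt(int sig)
{
    signal(sig, note_interrupt);    /* re-arm, in case the action was reset */
    intflag = 1;
}

/* Report whether an interrupt arrived since the last call, clearing the flag. */
int interrupt_pending(void)
{
    int was_set = intflag;

    intflag = 0;
    return was_set;
}

/* Arrange for SIGINT to be recorded rather than to terminate the process. */
int watch_interrupts(void)
{
    return signal(SIGINT, note_interrupt) == SIG_ERR ? -1 : 0;
}
```

The code that updates the linked list runs to completion, and the main loop calls interrupt_pending() at a point where stopping is safe.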

There is one difficulty associated with this approach. Suppose the program is 
reading the standard input when the interrupt is sent. The specified routine is 
duly called; it sets its flag and returns. If it were really true, as we said above, 
that ‘execution resumes at the exact point it was interrupted,’ the program would 
continue reading stdin until the user typed another line. This behavior might 
well be confusing, since the user might not know that the program is reading; he 
presumably would prefer to have the signal take effect instantly. The method 
chosen to resolve this difficulty is to terminate the read when execution resumes 
after the signal, returning an error code which indicates what happened. 

Thus programs which catch and resume execution after signals should be 
prepared for ‘errors’ which are caused by interrupted system calls. 
The ones to watch out for are read(), wait(), and pause(). A program
whose onintr() routine just sets intflag, resets the interrupt signal, and
returns, should usually include code like the following when it reads the standard
input:


    if (getchar() == EOF)
        if (intflag)
            /* EOF caused by interrupt */
        else
            /* true end-of-file */
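
At the level of the read() system call, the same situation shows up as an error return. The sketch below assumes the POSIX convention that an interrupted call returns -1 with errno set to EINTR; the name read_retry() is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Like read(), but start over when a caught signal interrupts the call
 * before any data has been transferred. */
ssize_t read_retry(int fd, void *buf, size_t count)
{
    ssize_t n;

    do {
        n = read(fd, buf, count);
    } while (n < 0 && errno == EINTR);
    return n;
}
```

Retrying is only one possible policy. A program that wants the interrupt to take effect at once would test its flag after the failed read instead of looping.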


A final subtlety to keep in mind becomes important when catching signals is
combined with executing other programs. Suppose a program catches interrupts,
and also includes a method (like ! in ex and vi) whereby other programs can be
executed. Then the code should look something like this:


    if (fork() == 0)
        execl(...);
    signal(SIGINT, SIG_IGN);    /* ignore interrupts */
    wait(&status);              /* until the child is done */
    signal(SIGINT, onintr);     /* restore interrupts */


Why is this? Again, it’s not obvious, but not really difficult. Suppose the pro- 
gram you call catches its own interrupts. If you interrupt the subprogram, it will 
get the signal and return to its main loop, and probably read from stdin. But 
the calling program will also pop out of its wait for the subprogram and read 
from stdin. Having two processes reading the same input is very unfortunate, 
since the system figuratively flips a coin to decide which should get each line of 
input. A simple way out is to have the parent program ignore interrupts until the 
child is done. This reasoning is reflected in the standard I/O library function 
system: 
    #include <signal.h>

    system(s)       /* run command string s */
    char *s;
    {
        int status, pid, w;
        register int (*istat)(), (*qstat)();

        if ((pid = fork()) == 0) {
            execl("/bin/sh", "sh", "-c", s, 0);
            _exit(127);
        }
        istat = signal(SIGINT, SIG_IGN);
        qstat = signal(SIGQUIT, SIG_IGN);
        while ((w = wait(&status)) != pid && w != -1)
            ;
        if (w == -1)
            status = -1;
        signal(SIGINT, istat);
        signal(SIGQUIT, qstat);
        return (status);
    }
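
The value returned by system() is the raw status from wait(), not the command's exit code. A caller might decode it as in this sketch, which assumes the POSIX macros from sys/wait.h; the name run_command() and the "128 plus signal number" convention are illustrative choices, not something this text prescribes.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Run `cmd` through the shell and reduce the wait status to one number:
 * the command's exit code, 128 + signal number if it was killed by a
 * signal, or -1 if it could not be run at all. */
int run_command(const char *cmd)
{
    int status = system(cmd);

    if (status == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```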


As an aside on declarations, the function signal() obviously has a rather
strange second argument. It is in fact a pointer to a function, and this is also the
type of the value returned by signal() itself. The two values SIG_IGN and SIG_DFL have
the right type, but are chosen so they coincide with no possible actual functions.
For the enthusiast, here is how they are defined for the Sun system; the
definitions should be sufficiently ugly and nonportable to encourage use of the
include file.


    #define SIG_DFL (void (*)())0
    #define SIG_IGN (void (*)())1
The Standard I/O Library


Input and output are, strictly speaking, not an intrinsic part of the C programming 
language. Rather, the input and output functions are supplied by a library which 
comes with each implementation. 

This chapter describes the Standard I/O Library available to C programmers on 
Sun workstations. 


5.1. The Standard I/O Library


The standard I/O library was designed with the following goals in mind: 

1. It must be as efficient as possible, both in time and in space, so that there 
will be no hesitation in using it, no matter how critical the application. 

2. It must be simple to use, and also free of the magic numbers and mysterious 
calls whose use mars the understandability and portability of many programs 
using older packages. 

3. The interface provided should be applicable on all machines, whether or not 
the programs which implement it are directly portable to other systems, or to 
non-Sun machines running a version of UNIX. 


5.2. Using the Standard I/O Library

The stdio.h routines are in the normal C library, so no special library
argument is needed when linking your program. All names in the include file
intended only for internal use begin with an underscore (_) to reduce the
possibility of collision with a user name. The names intended to be visible outside the
package are listed in Table 5-1.

The routines in this package offer the convenience of automatic buffer allocation 
and output flushing where appropriate. 


The names stdin, stdout, and 
stderr are constants and may not 
be assigned values. They 
correspond to file descriptors 0, 1 
and 2, respectively. 


Any program which uses the Standard I/O Library must have the following line 
in the program source text, before using any of the functions in the library. 


    #include <stdio.h>


Putting this include statement in your program defines some macros and vari- 
ables for the program. 
Table 5-1 Standard I/O Library Names Accessible to User Programs

Name      Description

stdin     The name of the standard input file. This file is automatically
          connected at program startup time, and is the place from which a
          program reads its input.

stdout    The name of the standard output file. This file is automatically
          connected at program startup time, and is the place to which a
          program writes its output.

stderr    The name of the standard error file. This file is automatically
          connected at program startup time, and is the place to which a
          program writes any error or diagnostic responses which should not
          clutter up the standard output.

EOF       is actually the value -1. EOF is returned by the read routines upon
          encountering end-of-file or error conditions.

NULL      is a notation for the null pointer. Functions whose values are
          pointers return NULL to indicate an error.

FILE      is an abbreviation for the declaration: struct iob and is a useful
          notation when declaring a pointer to a stream.

BUFSIZ    is the size suitable for a user-supplied input-output buffer.
          BUFSIZ is usually 1024. See the setbuf() function described below.


The functions getc(), getchar(), putc(), putchar(), feof(),
ferror(), and fileno() are all defined as macros. Their descriptions appear
later in this chapter. They are mentioned here to indicate that they cannot be
redeclared. In addition, because they are macros and not functions, they cannot
be passed as arguments to other functions, nor can their addresses be taken.

The ‘Standard I/O Library’ is a collection of routines intended to provide 
efficient and portable I/O services for most C programs. The standard I/O library 
is available on each system that supports C, so programs that confine their system 
interactions to its facilities can be transported from one system to another essen- 
tially without change. 

This chapter describes the basics of the standard I/O library. Following chapters 
contain a fuller description of the capabilities and calling conventions of the 
functions in it. 

You could do I/O by calling the system routines directly. However, there is a 
‘standard I/O package’ that provides a high-level I/O access mechanism. This 
and the following chapters discuss the functions available in the standard I/O 
package. (An appendix discusses the raw interface to the operating system.) In 
general, you can get by using the standard I/O package and never need to use the 
raw system calls. 

The standard I/O package provides access to files in the system through a
collection of file descriptors that refer to structures for managing I/O buffering. The
first part of the discussion in this chapter describes those file descriptors that are
defined automatically. Following sections describe how to get your own
descriptors connected to files in the system.

5.3. The Standard Input and Standard Output

Three files are connected automatically when a SunOS program starts up. These
files are called the standard input (stdin), the standard output (stdout),
and the standard error (stderr).

The very simplest standard I/O call for output is to use putchar(c) to put the
character c on the standard output, which is normally the user’s screen.

The user can redirect the standard output to a file by using the > syntax on the
command line. For example, if you typed:

    tutorial% prog > outputfile


on the command line, the standard output from prog is written to outputfile and
the program is unaware that the standard output is going to a file instead of the
screen. outputfile is created if it doesn’t exist; if it already exists, its previous
contents are overwritten.

Similarly, you can send the standard output from a program through a pipe with 
the command line: 


    tutorial% prog | otherprog


and the standard output of prog goes into the standard input of otherprog. 

Reading Standard Input and Writing Standard Output

The simplest input mechanism is to read from the ‘standard input,’ which is
generally the user’s keyboard. The function getchar() returns the next input
character each time it is called. A file may be substituted for the keyboard by
using the < convention (input redirection): if prog uses getchar(), the
command line

    tutorial% prog < filename


makes prog read from the file specified by filename, instead of from the key- 
board. prog itself need know nothing about where its input is coming from. 
This is also true if the input comes from another program through the pipe 
mechanism: 


    tutorial% otherprog | prog


provides the standard input for prog from the standard output (see above) of 
otherprog. 
getchar() returns the value EOF when it encounters the end of file (or an
error) on whatever you are reading. The value of EOF is normally defined to be
-1, but it is unwise to take any advantage of that knowledge. This value is
defined for you by the stdio.h include file, and need not be of any concern.

The function printf(), which formats output in various ways, uses the same
mechanism as putchar() does, so calls to printf() and putchar() may
be intermixed in any order; the output appears in the order of the calls.

Similarly, the function scanf() provides for formatted input conversion.
scanf() reads the standard input and breaks it up into strings, numbers, etc., as
desired. scanf() uses the same mechanism as getchar(), so calls to them
may also be intermixed.

Many programs read only one input and write one output; for such programs I/O
with getchar(), putchar(), scanf(), and printf() may be entirely
adequate, and it is almost always enough to get started. This is particularly true
if the SunOS pipe facility is used to connect the output of one program to the
input of another. For example, the following program strips out all ASCII control
characters from its input (except for newline and tab).


    #include <stdio.h>

    main()      /* ccstrip: strip non-graphic characters */
    {
        int c;

        while ((c = getchar()) != EOF)
            if ((c >= ' ' && c < 0177) || c == '\t' || c == '\n')
                putchar(c);
        exit(0);
    }


You would use the program like this: 


tutorial% cat infile | ccstrip > output 


If you need to treat multiple files, you can use cat to collect the files for you,
by naming all of them on the cat command line, and thus avoid learning how to
access files from a program. By the way, the call
to exit() at the end is not necessary to make the program work properly, but it
assures that any caller of the program will see a normal termination status
(conventionally 0) from the program when it completes. Section 3.3 discusses
returning status in more detail.
5.4. Error Handling — stderr and exit()

stderr is assigned to a program in the same way that stdin and stdout are.
Output written on stderr appears on the user’s terminal even if the standard
output is redirected, unless the standard error is also redirected. For example, the
command wc writes its diagnostics on stderr instead of stdout so that if one
of the files can’t be accessed for some reason, the message finds its way to the
user’s terminal instead of disappearing down a pipe or into an output file.

The argument of exit() is made available to whatever process called the
process that is exiting (see Section 3.3), so the success or failure of the program can
be tested by another program that uses this one as a subprocess. By convention,
a return value of 0 indicates that all is well; nonzero values indicate abnormal
situations.

exit() itself calls fclose() for each open output file, to flush out any
buffered output, then calls a routine named _exit(). The function _exit()
terminates the program immediately without any buffer flushing; it may be called
directly if desired.

5.5. Miscellaneous I/O Functions

The standard I/O library provides several other I/O functions besides those
illustrated above.

Normally, output with putc() and such is buffered; use fflush(fp) to
force it out immediately.

fscanf() is identical to scanf(), except that its first argument is a file
pointer that specifies the file from which the input comes; it returns EOF at end of
file.

The functions sscanf() and sprintf() are identical to fscanf() and
fprintf(), except that the first argument names a character string instead of a
file pointer. The conversion is done from the string for sscanf() and into it
for sprintf(), and no input or output is done.
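
A small illustration of string-to-string conversion follows. The function name minutes_from_clock() is hypothetical, and the sketch uses snprintf(), a bounded variant of sprintf() from later C standards, which is an assumption beyond this text.

```c
#include <stdio.h>

/* Parse "HH:MM" from `text` and write the total minutes, as decimal text,
 * into `out` (of size `outsize`). Returns 0 on success, -1 on bad input. */
int minutes_from_clock(const char *text, char *out, size_t outsize)
{
    int hours, minutes;

    if (sscanf(text, "%d:%d", &hours, &minutes) != 2)
        return -1;                  /* fewer than two numbers converted */
    if (hours < 0 || minutes < 0 || minutes > 59)
        return -1;
    snprintf(out, outsize, "%d", hours * 60 + minutes);
    return 0;
}
```

Checking the value returned by sscanf(), which is the number of items converted, is what tells the caller whether the input had the expected form.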

fgets(buf, size, fp) copies the next line from stream fp, up to and
including a newline, into buf; at most size-1 characters are copied; it returns
NULL at end of file. fputs(buf, fp) writes the string in buf onto the
stdio stream fp.
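
Because fgets() copies at most size-1 characters, a long line arrives in several pieces. The sketch below, with the hypothetical name count_lines(), shows one way to account for that when counting lines.

```c
#include <stdio.h>
#include <string.h>

/* Count the lines on stream fp, reading it one fgets() call at a time.
 * A final line without a newline still counts as a line. */
long count_lines(FILE *fp)
{
    char buf[256];
    long lines = 0;
    int at_line_start = 1;

    while (fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);

        if (at_line_start)
            lines++;
        /* a long line arrives in pieces; only the last piece ends in '\n' */
        at_line_start = (len > 0 && buf[len - 1] == '\n');
    }
    return lines;
}
```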

Note The "stream" referred to above is not related to UNIX System V streams. 

The functions gets() and puts() work like fgets() and fputs(), but
they default to operation with stdin and stdout, respectively. The macro
ungetc(c, fp) ‘pushes back’ the character c onto the input stream fp; a
subsequent call to getc(), fscanf(), and so on will encounter c. Only one
character of pushback is guaranteed to work.
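
A common use of pushback is to look at the next character without consuming it. A minimal sketch, with the hypothetical name peek_char():

```c
#include <stdio.h>

/* Return the next character on fp without consuming it, or EOF. */
int peek_char(FILE *fp)
{
    int c = getc(fp);

    if (c != EOF)
        ungetc(c, fp);      /* push it back for the next reader */
    return c;
}
```

Since only one character of pushback is guaranteed, the sketch never pushes back more than the single character it just read.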
6


Accessing Files Through Standard I/O 


Previous examples have all read the standard input and written the standard
output, which we have assumed are magically predefined. The next step is to write a
program that accesses a file that is not already connected to the program. One
simple example is wc, which counts the lines, words and characters in a set of
files. For instance, the command

    tutorial% wc x.c y.c

displays the number of lines, words and characters in x.c and y.c and the
totals.

The question is how to arrange for the named files to be read — that is, how to 
connect the filenames to the I/O statements which actually read the data. 

The rules are simple: you have to open a file by the standard library function
fopen() before it can be read from or written to. fopen() takes an external
name (like x.c or y.c), does some housekeeping and negotiation with the
operating system, and returns a pointer which must be used in subsequent reads
or writes of the file.

This pointer, called a FILE pointer, points to a structure which contains information
about the file, such as the location of a buffer, the current character position in
the buffer, whether the file is being read or written, and the like. The only
declaration needed for a file pointer is exemplified by


    FILE *fp, *fopen();

This says that fp is a pointer to a FILE, and fopen() returns a pointer to a
FILE.

The actual call to fopen() in a program has the form:

    fp = fopen(name, mode);
The next thing needed is a way to read or write the file once it is open. There are
several possibilities, of which getc() and putc() are the simplest. getc()
returns the next character from a file; it needs the file pointer to tell it what file.
Thus

    c = getc(fp)

places in c the next character from the file referred to by fp; it returns EOF when
it reaches end of file. putc() is the inverse of getc():

    putc(c, fp)

puts the character c on the file fp and returns c as its value. getc() and
putc() return EOF on error.

When a program is started, three streams are opened automatically, and file
pointers are provided for them. These streams are the standard input, the
standard output, and the standard error output; the corresponding file pointers are
called stdin, stdout, and stderr. Normally these are all connected to the
terminal, but may be redirected to files or pipes as described in Section 5.3.
stdin, stdout and stderr are predefined in the I/O library as the standard
input, output and error files; they may be used anywhere an object of type
FILE * can be. They are constants, however, not variables, so don’t try to
assign to them.

With some of the preliminaries out of the way, we can now write wc . The basic 
design is one that has been found convenient for many programs: if there are 
command-line arguments, they are processed in order. If there are no arguments, 
the standard input is processed. This way the program can be used standalone or 
as part of a larger activity. 
    #include <stdio.h>

    main(argc, argv)    /* wc: count lines, words, chars */
    int argc;
    char *argv[];
    {
        int c, i, inword;
        FILE *fp, *fopen();
        long linect, wordct, charct;
        long tlinect = 0, twordct = 0, tcharct = 0;

        i = 1;
        fp = stdin;
        do {
            if (argc > 1 && (fp = fopen(argv[i], "r")) == NULL) {
                fprintf(stderr, "wc: can't open %s\n", argv[i]);
                continue;
            }
            linect = wordct = charct = inword = 0;
            while ((c = getc(fp)) != EOF) {
                charct++;
                if (c == '\n')
                    linect++;
                if (c == ' ' || c == '\t' || c == '\n')
                    inword = 0;
                else if (inword == 0) {
                    inword = 1;
                    wordct++;
                }
            }
            printf("%7ld %7ld %7ld", linect, wordct, charct);
            printf(argc > 1 ? " %s\n" : "\n", argv[i]);
            fclose(fp);
            tlinect += linect;
            twordct += wordct;
            tcharct += charct;
        } while (++i < argc);
        if (argc > 2)
            printf("%7ld %7ld %7ld total\n", tlinect, twordct, tcharct);
        exit(0);
    }


The function fprintf() is identical to printf(), except that the first
argument is a file pointer that specifies the file to be written.

The function fclose() is the inverse of fopen(); it breaks the connection
between the file pointer and the external name that was established by fopen(),
freeing the file pointer for another file. There is a limit, depending on available
memory, on the number of files that a program may have open simultaneously,
so you should free things when they are no longer needed. There is another
reason to call fclose() on an output file: it flushes the buffer in which
putc() collects output. Each file is closed automatically when a program
terminates normally.

6.1. Accessing Files

Several stdio routines needed to perform file I/O housekeeping and access
functions are described below:

fopen() — Open a File

filename  is a character string that specifies the name of the file.

type      is a character string (not a single character) that specifies the
          access mode of the file. type can be one of:
              r   open the file for reading,
              w   open the file for writing,
              a   open the file for appending.

fopen() opens the file and, if needed, allocates a buffer for it. In
addition, each mode specification may be followed by a + sign to
open the file for reading and writing. Both reads and writes may be
used on read/write streams, with the limitation that an fseek(),
rewind(), or reading end-of-file must be used between a read and
a write or vice versa. The value returned is a file pointer. If it is
NULL the attempt to open the file failed.

Figure 6-1 Example of Using fopen()

    demo()
    {
        FILE *fp, *fopen();

        /* open the file */
        if ((fp = fopen("/usr/lib/tmac/tmac.e", "r")) == NULL)
            printf("Can't open /usr/lib/tmac/tmac.e\n");
        else
            ... go ahead and work with the file
    }   /* end of the demo function */

If a file that you open for writing or appending does not exist, it is created (if pos- 
sible). Opening an existing file for writing discards the old contents. Trying to 
read a file that does not exist is an error, and there may be other causes of error as 
well (like trying to read a file without read permission). If there is an error, 
fopen ( ) returns the null pointer value NULL. 
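
The difference between the w and a modes can be seen with three small helpers. This is a sketch under stated assumptions: the function names are hypothetical and do not come from the library.

```c
#include <stdio.h>

/* Replace the contents of `path` with `text` (mode "w" discards old contents). */
int overwrite_file(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    fputs(text, fp);
    return fclose(fp) == EOF ? -1 : 0;
}

/* Add `text` to the end of `path`, creating it if needed (mode "a"). */
int append_file(const char *path, const char *text)
{
    FILE *fp = fopen(path, "a");

    if (fp == NULL)
        return -1;
    fputs(text, fp);
    return fclose(fp) == EOF ? -1 : 0;
}

/* Return the number of characters in `path`, or -1 if it cannot be read. */
long file_length(const char *path)
{
    FILE *fp = fopen(path, "r");
    long n = 0;

    if (fp == NULL)
        return -1;          /* for example, the file does not exist */
    while (getc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```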
The stream named by ioptr is closed, if necessary, and then reopened as if by
fopen(). If the attempt to open fails, NULL is returned; otherwise ioptr is
returned, which now refers to the new file. Often the reopened stream is stdin
or stdout. The filename and type parameters are as for fopen().

filename  is a character string that specifies the name of the file.

type      is a character string (not a single character) that specifies the
          access mode of the file. type can be one of:
              r   reopen the file for reading,
              w   reopen the file for writing,
              a   reopen the file for appending.

ioptr     is a pointer to the existing stream which is to be closed.

The value of the freopen() function is a file pointer. If the value of the file
pointer is NULL, the attempt to open the file failed.


Figure 6-2 Example of Using freopen()

    demo()
    {
        extern FILE *fp;    /* assumed: an already open stream */
        FILE *freopen();

        /* re-open the file */
        if ((fp = freopen("/lib/ftr.cterrs", "r", fp)) == NULL)
            printf("Can't open /lib/ftncterrs\n");
        else
            ... go ahead and work with the file
    }   /* end of the demo function */
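
Reopening is most often used to send an existing stream somewhere else. The following sketch wraps that in one call; the name send_stream_to_file() is hypothetical.

```c
#include <stdio.h>

/* Reopen `stream` so that later output goes to `path`, discarding the
 * file's old contents. Returns 0 on success; on failure the stream has
 * been closed and must not be used again. */
int send_stream_to_file(FILE *stream, const char *path)
{
    if (freopen(path, "w", stream) == NULL)
        return -1;
    return 0;
}
```

A program might call send_stream_to_file(stdout, name) early on, after which every printf() and putchar() writes to the named file without further changes to the code.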


fflush() — Flush Stream Buffer

The fflush() function flushes the stream buffer for a given file pointer. The
interface to fflush() is:

    fflush(ioptr)
    FILE *ioptr;

Any buffered information on the output stream designated by ioptr is written
out to the file. Common use is to fflush(stdout) so that the prompt
appears immediately.
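
The prompt case might be sketched as below. The function name ask() is hypothetical; it takes the streams as parameters only so that it can be tried on any pair of streams, where a program would normally pass stdin and stdout.

```c
#include <stdio.h>

/* Write a prompt without a trailing newline, force it out at once, then
 * read the reply into `reply`. Returns 0 on success, -1 on end of file
 * or error. */
int ask(FILE *in, FILE *out, const char *prompt, char *reply, int size)
{
    fputs(prompt, out);
    if (fflush(out) == EOF)     /* without this the prompt may sit in the buffer */
        return -1;
    if (fgets(reply, size, in) == NULL)
        return -1;
    return 0;
}
```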

Output files are normally buffered if they are not directed to a screen. The
stderr file usually starts off unbuffered, and remains unbuffered
unless the setbuf() function is used, or unless the file is reopened.

fclose() — Close a File

The fclose() function closes an open file. The interface definition is:

    fclose(ioptr)
    FILE *ioptr;


The file designated by ioptr is closed, after any buffers associated with that 
file have been written out. 

Any buffers allocated to the file are freed. 

When a C program terminates normally (in a controlled fashion), fclose ( ) 
requests are issued automatically. 

setbuf() — Set Buffer for File I/O

The setbuf() function sets up a buffer for an open file. The user can
designate a buffer different from the one which the run-time library chooses, or the
user can select no buffer at all. The interface to setbuf() is:

    setbuf(ioptr, buf)
    FILE *ioptr;
    char *buf;

The setbuf() function is used after a file is opened, but before any I/O
transfers have been made to that file.

If the buf parameter is NULL, the stream becomes unbuffered. Otherwise, the
buffer supplied is used. The buffer buf must be a sufficiently large character
array. The usual way to assure this is to declare the buffer:

    char buf[BUFSIZ];


Here’s an example of setbuf() usage:

Figure 6-3 Example of Using setbuf()
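
A minimal sketch of such usage, under stated assumptions and not the manual's own figure, is given here. The function names and the static buffer log_buffer are hypothetical.

```c
#include <stdio.h>

static char log_buffer[BUFSIZ];     /* must outlive the stream that uses it */

/* Open `path` for writing and give the stream our own buffer.
 * setbuf() is called right after fopen(), before any I/O on the stream. */
FILE *open_with_own_buffer(const char *path)
{
    FILE *fp = fopen(path, "w");

    if (fp != NULL)
        setbuf(fp, log_buffer);
    return fp;
}

/* Open `path` for writing with buffering turned off entirely. */
FILE *open_unbuffered(const char *path)
{
    FILE *fp = fopen(path, "w");

    if (fp != NULL)
        setbuf(fp, NULL);
    return fp;
}
```

The buffer is declared static because it has to remain valid until the stream is closed; an automatic array in a function that returns before the fclose() would not do.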



f ± leno ( ) — Obtain File The f ileno ( ) function returns an integer value which is the file descriptor 

Descriptor associated with the file. 



f ileno ( ) is typically used when a file has been previously opened with 
f open ( ) but you want to use a function on it that requires a file descriptor 
instead of a file pointer. 

Here’s an example of f ileno ( ) usage: 
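
A minimal sketch, assuming a hypothetical helper named stream_size() and the fstat() system call, which takes a file descriptor rather than a file pointer:

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * stream_size() is a hypothetical helper. It reports the current size
 * in bytes of the file behind an open stream, or -1 on failure.
 * fstat() wants a file descriptor, so fileno() supplies it.
 */
long stream_size(FILE *fp)
{
    struct stat st;

    if (fflush(fp) != 0)            /* push buffered output to the file first */
        return -1L;
    if (fstat(fileno(fp), &st) != 0)
        return -1L;
    return (long) st.st_size;
}
```

The fflush() call matters here: data still sitting in the stream buffer is not yet visible through the descriptor.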

Figure 6-4 Example of Using fileno()

rewind() — Rewind a Stream

The rewind() function rewinds the stream designated by the ioptr parameter.

    rewind(ioptr)
    FILE *ioptr;

If you want to rewind a file for reading, use freopen().

Character I/O


This section describes those macros and functions which are concerned with 
reading and writing characters from and to streams. 

getc() Macro — Get a Character from a File

The getc() macro gets a character from a file. The definition is:

The getc() macro obtains the next character from the stream designated by
fp. fp is a file pointer such as is returned by the fopen() function, or is a
name such as stdin.

When the end of file is reached, the integer EOF is returned. The character \0 is
a valid character from getc().

Note that getc() is a macro, not a function.

Figure 7-1 Example of Using getc()

    #include <stdio.h>

    main(argc, argv)
    int argc;
    char *argv[];
    {
        FILE *fp;
        int num_char = 0;
        int c;

        if ((fp = fopen(argv[1], "r")) == NULL)
            printf("Can't open %s\n", argv[1]);
        else
            /* count characters in a file */
            while (getc(fp) != EOF)
                num_char++;

    } /* end of the count function */

fgetc() Function — Get Character from File

The fgetc() function obtains a single character from a file. The interface
definition is:

    int fgetc(ioptr)
    FILE *ioptr;

fgetc() obtains the next character from the stream designated by ioptr.
ioptr is a file pointer such as is returned by the fopen() function, or is a
name such as stdin.

When the end of file is reached, the integer EOF is returned. The character \0 is
a valid character from fgetc().

fgetc() is a genuine function, as opposed to the getc() macro. This means
that fgetc() can be pointed to and passed as an argument to another function.
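
Passing fgetc() as an argument can be sketched like this. count_with() is a hypothetical helper introduced only for illustration, and the sketch uses ANSI prototypes so that it compiles on current compilers.

```c
#include <stdio.h>

/*
 * count_with() is a hypothetical helper. It counts the characters left
 * on fp, reading them through whatever function the caller supplies.
 * Any function with the same shape as fgetc() can be passed in.
 */
long count_with(int (*reader)(FILE *), FILE *fp)
{
    long n = 0;

    while ((*reader)(fp) != EOF)
        n++;
    return n;
}
```

A call such as count_with(fgetc, fp) works because fgetc names a real function whose address can be taken; the same is not guaranteed for a macro such as getc().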
Figure 7-2 Example of Using fgetc()

getchar() Macro — Get a Character from Standard Input

The getchar() macro obtains a single character from the standard input. The
interface to getchar() is:

The getchar() macro is a shorthand notation for

    getc(stdin)

Note that getchar() is a macro, not a function.

Figure 7-3 Example of Using getchar()
fgets() — Read a String from a File

The fgets() function reads a string from a specified file. The interface
definition is:

    char *fgets(s, n, ioptr)
    char *s;
    int n;
    FILE *ioptr;

The fgets() function reads up to n-1 characters from the stream designated
by ioptr into the character array pointed to by s.

Note: Be careful that s can accommodate n characters!

The read terminates when a newline character is read. The newline character is
placed in the buffer. The last character read is always followed by a null
character in the character array.

The fgets() function returns its first argument, or NULL if an error or an end
of file was encountered.


Figure 7-4 Example of Using fgets()

    #include <stdio.h>

    main(argc, argv)
    int argc;
    char *argv[];
    {
        FILE *fp;
        char line[256];
        int num_line = 0;

        if ((fp = fopen(argv[1], "r")) == NULL)
            printf("Can't open %s\n", argv[1]);
        else
            /* count lines in a file */
            while ((fgets(line, 256, fp)) != NULL)
                num_line++;

    } /* end of the count function */

ungetc() — Push a Character Back on a Stream

The ungetc() function pushes a single character back onto a stream. The
interface definition is:

    ungetc(c, ioptr)
    char c;
    FILE *ioptr;

The ungetc() function pushes the character argument, c, back onto the input
stream designated by ioptr.

Only one character may be pushed back between two reads.

Figure 7-5 Example of Using ungetc()
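
A minimal sketch of the usual reason for pushing a character back: a routine reads one character too many in order to find where its own input ends. read_number() is a hypothetical helper name, and the sketch uses ANSI prototypes.

```c
#include <stdio.h>
#include <ctype.h>

/*
 * read_number() is a hypothetical helper. It reads a run of decimal
 * digits from fp and returns their value, or -1 if no digit was found.
 * The first non-digit character is pushed back so that the next read
 * on fp sees it again.
 */
long read_number(FILE *fp)
{
    long value = 0;
    int c;
    int seen = 0;

    while ((c = getc(fp)) != EOF && isdigit(c)) {
        value = value * 10 + (c - '0');
        seen = 1;
    }
    if (c != EOF)
        ungetc(c, fp);              /* only one character of pushback is needed */
    return seen ? value : -1L;
}
```

Given the input 42+7, the first call returns 42 and leaves the + on the stream for the caller to read.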



putc() Macro — Put a Character to a File

The putc() macro puts a single character to a specified file. The interface
definition is:

The putc() macro writes the character c onto the output stream designated by
ioptr, where ioptr is a file pointer such as is returned by the fopen()
function, or is a name such as stdout or stderr.

The character c is normally returned as a value from the macro, but if an error
occurs during the transfer, the value EOF is returned.

Note that putc() is a macro, not a function.

Remember that putc() normally buffers its output; terminal I/O is not properly
synchronized unless this buffering is defeated. Use fflush() to do this.

fputc() Function — Put a Character to a File

The fputc() function outputs a single character to a specified file. The
interface definition is:

The fputc() function writes the character c onto the stream designated by
ioptr, where ioptr is a file pointer such as is returned by the fopen()
function, or is a name such as stdout or stderr.

The character c is normally returned as a value from the function, but if an error
occurs during the transfer, the value EOF is returned.

fputc() is a genuine function, as opposed to the putc() macro. This means
that fputc() can be pointed to, passed as an argument to another function, and
so on.

Figure 7-6 Example of Using fputc()
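
A minimal sketch of fputc() in a copy loop that checks the EOF return value described above. copy_stream() is a hypothetical helper name, and the sketch uses ANSI prototypes.

```c
#include <stdio.h>

/*
 * copy_stream() is a hypothetical helper. It copies everything that
 * remains on in to out, one character at a time, and returns the
 * number of characters copied, or -1 if a write fails.
 */
long copy_stream(FILE *in, FILE *out)
{
    long n = 0;
    int c;

    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF)   /* EOF from fputc() signals an error */
            return -1L;
        n++;
    }
    return n;
}
```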



putchar() Macro — Put a Character to Standard Output

The putchar() macro puts a single character to the standard output file. The
interface definition is:

The putchar() macro is a shorthand notation for

    putc(c, stdout)

Note that putchar() is a macro, not a function.

Figure 7-7 Example of Using putchar()



fputs() — Put a String to a File

fputs() writes a character string to a file. The interface definition is:

The fputs() function writes the null-terminated character string s (which is a
character array) to the stream designated by ioptr.

fputs() does not append a newline to the string.

fputs() does not return a value.

Figure 7-8 Example of Using fputs()

feof() — Test for End Of File

The feof() function checks for an end of file on a specified file. The interface
definition is:

The feof() function returns a nonzero value if an end-of-file has occurred on
the stream designated by ioptr.

7.1. Formatted Input and Output

The C run-time library provides extensive facilities for formatted conversions of
character strings to numeric data, and for the formatted conversion of numeric
data to character strings. Conversions can be done between the standard input or
standard output, an arbitrary file, or strings in memory. The following
subsections give detailed descriptions of these facilities.

Formatted Output Conversions

There are three variations of the formatted output functions. They are all similar
in their actions, the only difference being the destination of the formatted string.

    printf(format, arg, ...)
    char *format;

printf() writes the formatted string to the standard output.

    fprintf(ioptr, format, arg, ...)
    FILE *ioptr;
    char *format;

fprintf() writes the formatted string to the file designated by ioptr.

    sprintf(s, format, arg, ...)
    char *s;
    char *format;

sprintf() stores the formatted string into a character string (character array)
in memory.
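
The difference in destination can be sketched with the same format string sent to two places. format_point() and report_point() are hypothetical helper names. Note that sprintf() does not check the size of the receiving array, so the sketch assumes the caller supplies at least 32 characters.

```c
#include <stdio.h>

/*
 * format_point() is a hypothetical helper. It stores "(x, y)" in out.
 * sprintf() does no bounds checking; out is assumed to hold at least
 * 32 characters, which is enough for any two int values.
 */
void format_point(char *out, int x, int y)
{
    sprintf(out, "(%d, %d)", x, y);
}

/*
 * report_point() is a hypothetical helper. It writes the same text,
 * plus a newline, to the stream fp and returns what fprintf() returns.
 */
int report_point(FILE *fp, int x, int y)
{
    return fprintf(fp, "(%d, %d)\n", x, y);
}
```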

Formatted Input Conversions

The scanf(), fscanf(), and sscanf() functions are the equivalents of the
printf() functions described above, except that the scanf() functions
perform conversions from character strings to data in memory. They are thus used
for reading formatted information instead of writing it.

There are three variations of the scanf() function:

    scanf(format, arg, ...)
    char *format;

scanf() reads the formatted string from the standard input.

    fscanf(ioptr, format, arg, ...)
    FILE *ioptr;
    char *format;

fscanf() reads the formatted string from the file designated by ioptr.

    sscanf(s, format, arg, ...)
    char *s;
    char *format;

sscanf() gets the formatted string from a character string (character array) in
memory.

The Format Control Templates

All six print and scan functions accept a format argument, followed by
zero or more arg arguments.

The format argument is a template, in the form of a character string. The
format character string consists of two kinds of objects:

□ It can contain fixed parts which are sent to the destination unchanged
  (for formatted output) or match characters in the input source (for
  formatted input).

□ It can also contain conversion specifications, which indicate how the
  corresponding arg is to be converted and placed into the final
  formatted output string, or recognized in the input, and converted to
  internal form and placed in the location pointed to by the arg.

Conversion Specifications

A conversion specification is marked by a percent sign %, and ends with a
conversion character. In between the % sign and the conversion character, there
can be modifiers. These modifiers are described below after the descriptions of
the conversion characters. Any character in a format that is not part of a
conversion specification is passed or recognized as is.
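
For formatted input, a fixed part of the template must match the input exactly. A minimal sketch, assuming a hypothetical helper named parse_time() and input of the form HH:MM:

```c
#include <stdio.h>

/*
 * parse_time() is a hypothetical helper. The template holds two
 * conversion specifications separated by a fixed ':' character, which
 * must appear in the input at that position. sscanf() returns the
 * number of items it converted, so 2 means both fields were read.
 */
int parse_time(const char *text, int *hour, int *minute)
{
    return sscanf(text, "%d:%d", hour, minute) == 2;
}
```

With the input 10-30 the first field converts, the ':' fails to match, and the helper reports failure.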

Here is a printf() call with a simple string template and no conversion
specifications:

    printf("Calling occupants of interactive space\n");

This example simply prints the quoted string on the standard output.

The following paragraphs describe the effects of the conversion characters.
There are also modifiers for the conversion specifications, and these are described
below.
d — Decimal Conversion

A d conversion character specifies that the associated argument is converted to
(or from) decimal notation.

Figure 7-9 Example of d Format Specification

When the above program is run, it generates the result:

o — Octal Conversion

A conversion character of o specifies that the associated argument is converted to
(or from) unsigned octal notation. The resulting output string does not contain a
leading zero. It is the responsibility of the programmer to insert the leading zero
"manually" as part of the format string, if that is what is required.

Figure 7-10 Example of o Format Specification

When the above program is run, it generates the result:

Note that the program explicitly places the digit "0" in the generated number.

x — Hexadecimal Conversion

A conversion character of x specifies that the associated argument is converted to
(or from) unsigned hexadecimal notation. The resulting output string does not
contain a leading "0x". It is the responsibility of the programmer to insert the
leading "0x" "manually", as part of the format string, if that is what is required.
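
The d, o, and x conversions can be compared side by side in a short sketch. show_bases() is a hypothetical helper name, and the receiving array is assumed to hold at least 64 characters. The leading 0 and 0x are ordinary characters in the format string, as described above.

```c
#include <stdio.h>

/*
 * show_bases() is a hypothetical helper. It formats one value in
 * decimal, octal, and hexadecimal. The "0" and "0x" prefixes are
 * fixed parts of the template, not output of the conversions.
 * out is assumed to hold at least 64 characters.
 */
void show_bases(char *out, int value)
{
    sprintf(out, "%d 0%o 0x%x", value, (unsigned) value, (unsigned) value);
}
```

For the value 25 this stores the text 25 031 0x19.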
Figure 7-11 Example of x Format Specification

    #include <stdio.h>

    main()
    {
        int data = 25;

        printf("The value of data is: 0x%x\n", data);

    } /* End of the demo function */

When the above program is run, it generates the result:

    The value of data is: 0x19

Note that the programmer explicitly coded the "0x" in the generated number.


h — Short Conversion on Input Only

A conversion character of h is used only for formatted input, and specifies that
the associated argument is a pointer to a short int data item.

u — Unsigned Decimal Conversion

A conversion character of u specifies that the associated argument is converted to
(or from) unsigned decimal notation.

Figure 7-12 Example of u Format Specification

    #include <stdio.h>

    main()
    {
        int data = -25;

        printf("The value of data is: %u\n", data);

    } /* End of the demo function */

When the above program is run, it generates the result:

    The value of data is: 4294967271



c — Character Conversion

A conversion character of c specifies that the associated argument is to be
converted to (or from) a single character.

Figure 7-13 Example of c Format Specification

    #include <stdio.h>

    main()
    {
        static char data[10] = "Hi there!";

        printf("Parts of data are: %c %c %c\n",
            data[0], data[8], data[4]);

    } /* End of the demo function */

When the above program is run, it generates the result:

    Parts of data are: H ! h

s — String Conversion

A conversion character of s specifies that the associated argument is a string.
Characters from the string are printed until a null character is found, or until the
number of characters indicated by the precision specification (see below) are
used up.

Figure 7-14 Example of s Format Specification

    #include <stdio.h>

    main()
    {
        static char data[] = "Hello, World!";

        printf("The value of data is: '%s'\n", data);

    } /* End of the demo function */

When the above program is run, it generates the result:

    The value of data is: 'Hello, World!'

e — Exponential Floating Conversion

A conversion character of e specifies that the associated argument is assumed to
be a float or a double. It is converted to (or from) a decimal exponential
notation of the form

    [-]m.nnnnnnE[±]xx

where the length of the string of n's is specified by the precision. The default
precision is six decimal places.
Figure 7-15 Example of e Format Specification

    #include <stdio.h>

    main()
    {
        float data = 123.456;

        printf("The value of data is: %e\n", data);

    } /* End of the demo function */

When the above program is run, it generates the result:

    The value of data is: 1.234560e+02


f — Fractional Floating Conversion

A conversion character of f specifies that the associated argument is assumed to
be a float or a double. It is converted to (or from) a fixed-decimal notation.

    [-]mmm.nnnnnn

where the length of the string of n's is specified by the precision. The default
precision is six decimal places. The precision does not determine the number of
digits printed in f format, but the number of decimal places displayed.

Figure 7-16 Example of f Format Specification

    #include <stdio.h>

    main()
    {
        float data = 123.456;

        printf("The value of data is: %f\n", data);

    } /* End of the demo function */

When the above program is run, it generates the result:

    The value of data is: 123.456001



g — Adaptable Floating Conversion

A conversion character of g specifies that the associated argument is to be
converted to (or from) either e or f format, depending upon which is the shorter.
Non-significant zeros are not printed in g format. This is similar to FORTRAN's
G format conversion.

Figure 7-17 Example of g Format Specification

    #include <stdio.h>

    main()
    {
        float data = 123.456;

        printf("The value of data is: %g\n", data);

    } /* End of the demo function */

When the above program is run, it generates the result:

    The value of data is: 123.456


Literal Character Output

If the character which follows the % sign is not a conversion character, that
character is printed as is. Thus, to print a % sign, use a format conversion of %%.

Figure 7-18 Example of Literal Character Output

    #include <stdio.h>

    main()
    {
        int data = 25;

        printf("The value of data is: %y %%\n", data);

    } /* End of the demo function */

When the above program is run, it generates the result:

    The value of data is: y %

The two percent signs are displayed as one, and the unknown conversion character
(y) is output as is. The value of the data variable in the output list is simply
ignored, since no conversion specification in the format required data.

Optional Format Modifiers

Between the % sign and the format conversion letters as defined above, there may
be some optional information. The characters which may appear in these positions
are described below.

Left Justify Field

A minus sign (-) appearing before the conversion character specifies that the
argument is to be left-justified in the output field. The minus sign is optional.

After the minus sign can appear width and precision specifications, as described
next.

Minimum Field Width and Precision Specifications

The form of the optional field width and precision specifications is:

□ a digit string, which specifies a minimum field width. The converted 
number is printed in a field at least this wide, and wider if required. 
If the converted argument has fewer characters than the field width, 
it is padded on the left (or on the right, if a minus sign was given) 
with enough padding characters to make up the specified field width. 
The padding character is normally a space. If the field width is 
specified with a leading zero the output field is padded with zeros. 

□ a period character, which separates the field width from the next 
digit string. 

□ a digit string, which is the precision. The precision means one of 
two things. In the case of a float or a double argument, the pre- 
cision is the number of digits to be printed to the right of the decimal 
point. In the case of a string argument, the precision is the number 
of characters to be printed from the string. 

The examples below show the way that the justification, width, and precision 
specifications apply to string values when they are output. The value to be 
printed is the string "Wizard", which is six characters long. It is printed in a 
variety of format specifications, and there are vertical bars at either end of the 
field to show the extent of the field. 


Figure 7-19 Example of Field Width Specifications

    #include <stdio.h>

    main()
    {
        static char data[] = "Wizard";

        printf("data in %%4s format is: |%4s|\n", data);
        printf("data in %%-4s format is: |%-4s|\n", data);
        printf("data in %%10s format is: |%10s|\n", data);
        printf("data in %%-10s format is: |%-10s|\n", data);
        printf("data in %%10.4s format is: |%10.4s|\n", data);
        printf("data in %%-10.4s format is: |%-10.4s|\n", data);
        printf("data in %%.4s format is: |%.4s|\n", data);

    } /* End of the demo function */

When the above program is run, it generates the results: 


Xr microsystems 


Revision A of 27 March, 1990 






56 C Programmer ’ s Guide 


    data in %4s format is: |Wizard|
    data in %-4s format is: |Wizard|
    data in %10s format is: |    Wizard|
    data in %-10s format is: |Wizard    |
    data in %10.4s format is: |      Wiza|
    data in %-10.4s format is: |Wiza      |
    data in %.4s format is: |Wiza|

Length Modifier

If the conversion specification is preceded by an l, as in lx, it means that the
associated argument is a long, while lf indicates a double. If no length modifier
precedes the conversion specification, the associated argument is assumed to be
an int. A lone l preceding the conversion specification is ignored in Sun C
because ints and longs are the same.

In calls to scanf(), the arguments are pointers. Sizes in format specifiers must
be correct: use %f for floats and %lf for doubles.
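
The size rule for input can be sketched as follows. parse_pair() is a hypothetical helper name, and sscanf() is used in place of scanf() only so that the input can be supplied as a string.

```c
#include <stdio.h>

/*
 * parse_pair() is a hypothetical helper. It reads two numbers from
 * text: the first into a float (so %f), the second into a double
 * (so %lf). Each specifier must match the size of the object its
 * pointer refers to. Returns the number of items converted.
 */
int parse_pair(const char *text, float *f, double *d)
{
    return sscanf(text, "%f %lf", f, d);
}
```

Swapping the two specifiers would make sscanf() store a value of the wrong size through each pointer.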
String-Handling Functions

The C programming language has no language-defined facilities for manipulating
character string data. The C library does, however, provide a fairly rich set of
primitives for manipulating character strings.

This chapter discusses three major areas relating to string handling:

□ Macros for classifying characters (is a character uppercase, a letter, a
  digit, and such), plus macros for doing some minimal conversions
  (convert uppercase to lowercase).

□ Functions for handling null-terminated strings.

□ Functions for handling bit strings and byte strings.

8.1. Character Classification

The following macros classify ASCII-coded integer values. Each is a predicate
returning nonzero for true, zero for false. isascii() is defined for all integer
values; the rest are defined only where isascii(c) is true and on the single
non-ASCII value EOF (see stdio(3S)).

You should have the line:

    #include <ctype.h>

at the beginning of any program unit that uses these macros.

isalpha() — Is Character Alphabetic

isalpha(c)  c is a letter, a through z or A through Z.

isupper() — Is Character Uppercase Letter

isupper(c)  c is an upper case letter, A through Z.

islower() — Is Character Lowercase Letter

islower(c)  c is a lower case letter, a through z.

isdigit() — Is Character Decimal Digit

isdigit(c)  c is a digit, 0 through 9.


isxdigit() — Is Character Hexadecimal Digit

isxdigit(c)  c is a hexadecimal digit, 0 through 9, a through f, or A
through F.

isalnum() — Is Character Letter or Digit

isalnum(c)  c is an alphanumeric character, that is, c is a letter or a digit.

isspace() — Is Character Whitespace

isspace(c)  c is a space, tab, carriage return, newline, or formfeed.

ispunct() — Is Character Punctuation

ispunct(c)  c is a punctuation character (neither control nor alphanumeric).

isprint() — Is Character Printable

isprint(c)  c is a printing character, such as ASCII characters 0x20 (space)
through 0x7E (tilde).

iscntrl() — Is Character Control Character

iscntrl(c)  c is a delete character (0x7F) or an ordinary control character
(less than 0x20).

isascii() — Is Character an ASCII Character

isascii(c)  c is an ASCII character less than 0x80.

isgraph() — Is Character a Visible Graphic

isgraph(c)  c is a visible graphic character, an ASCII character code from
0x21 (exclamation mark) through 0x7E (tilde).

8.2. Character Conversion Macros

These macros perform simple conversions on single characters.

toupper() — Convert Lowercase to Uppercase

toupper(c) converts c to its upper-case equivalent. Note that this only works
as expected if c is known to be a lower-case character to start with (presumably
checked by islower()).

tolower() — Convert Uppercase to Lowercase

tolower(c) converts c to its lower-case equivalent. Note that this only works
as expected if c is known to be an uppercase character to start with (presumably
checked by isupper()).

toascii() — Ensure Character is ASCII

toascii(c) masks c with the value 0x7F so that its result is guaranteed to be
an ASCII character in the range 0 through 0x7F.

8.3. Functions for Handling Null-Terminated Strings

Null-terminated strings are arrays of characters. A correctly formed string has a
zero (ASCII NUL) byte at the end to act as a terminator. All string handling
routines and I/O routines conform to these semantics. C builds in this notion when
a programmer writes a string constant: the compiler correctly adds the null byte
at the end of the string. Suppose you have this declaration in your program:
Such a string appears in memory as:

Figure 8-1 Layout of Null-Terminated String in Memory

    H  i  ...  h  e  r  ...  \0

Functions described in this section operate on null-terminated strings. They do
not check for overflow of any receiving string.

You must have the line:

    #include <strings.h>

at the beginning of any program unit that uses the functions described here.

Null Pointers versus Null Strings

On Sun workstations (and on most other machines), you cannot use a zero
pointer to indicate a null string. Dereferencing a null pointer is an error and
results in aborting the program. If you wish to indicate a null string, you must
have a pointer that points to an explicit null string.

Programmers using NULL to represent an empty string should be aware that such
programs work by coincidence, if at all, rather than by intent and should be
aware that testing for zero pointers is inherently nonportable.


strlen() — Find Length of String

    strlen(s)
    char *s;

strlen() returns the number of non-null characters in s.

strcmp() and strncmp() — Compare Strings

    strcmp(string_1, string_2)
    char *string_1, *string_2;

    strncmp(string_1, string_2, n)
    char *string_1, *string_2;

strcmp() compares its arguments and returns an integer greater than, equal to,
or less than 0, according as string_1 is lexicographically greater than, equal to,
or less than string_2.

strncmp() makes the same comparison but examines at most n characters.

strcmp() uses native character comparison, which is signed on Sun workstations.

strcpy() copies string string_2 to string_1, stopping after the null character
has been moved. strncpy() copies exactly n characters, truncating or
null-padding string_2; the target may not be null-terminated if the length of
string_2 is n or more. Both return string_1.
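
Because strncpy() may leave the target without a terminator, a common precaution is to add one explicitly. A minimal sketch, assuming a hypothetical helper named copy_bounded(); it includes <string.h>, which is where current systems declare these routines.

```c
#include <string.h>

/*
 * copy_bounded() is a hypothetical helper. It copies src into dst,
 * which has room for size characters (size must be at least 1), and
 * always leaves dst null-terminated, truncating src if necessary.
 */
char *copy_bounded(char *dst, const char *src, size_t size)
{
    strncpy(dst, src, size - 1);    /* may fill dst without a terminator */
    dst[size - 1] = '\0';           /* so add one unconditionally */
    return dst;
}
```

Copying "Wizard" into a four-character array with this helper leaves the string "Wiz".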



strcat() appends a copy of string string_2 to the end of string string_1.
strncat() appends n characters at most. Both return a pointer to the
null-terminated result.

index() and rindex() — Find Character in String

index() returns a pointer to the first occurrence of character c in string s, or
zero if c does not occur in the string.

rindex() returns a pointer to the last occurrence of character c in string s, or
zero if c does not occur in the string.
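
The first-versus-last distinction can be sketched with two small helpers. base_name() and value_part() are hypothetical names chosen for illustration, and the sketch assumes index() and rindex() are declared in <strings.h>, as they are on current BSD-derived and Linux systems.

```c
#include <stddef.h>
#include <strings.h>

/*
 * base_name() is a hypothetical helper. It returns the part of path
 * after the LAST '/', or the whole path if it contains no '/'.
 */
const char *base_name(const char *path)
{
    const char *slash = rindex(path, '/');

    return (slash != NULL) ? slash + 1 : path;
}

/*
 * value_part() is a hypothetical helper. It returns the part of entry
 * after the FIRST '=', or NULL if entry contains no '='.
 */
const char *value_part(const char *entry)
{
    const char *eq = index(entry, '=');

    return (eq != NULL) ? eq + 1 : NULL;
}
```

Both helpers test the returned pointer before using it, since a zero pointer is the only indication that the character was not found.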
8.4. Byte String and Bit String Functions

Functions described in this section operate on byte strings and bit strings. They
do not recognize null-terminated strings, unlike the functions described in
Section 8.3.

bcmp() — Compare Byte Strings

    bcmp(b1, b2, length)
    char *b1, *b2;
    int length;

bcmp() compares length bytes at address b1 against length bytes at address b2,
returning zero if they are identical, nonzero otherwise.

bcopy() — Copy Byte Strings

    bcopy(b1, b2, length)
    char *b1, *b2;
    int length;

bcopy() copies length bytes, in left-to-right order, from string b1 to string b2.
Overlapping strings are handled correctly.

Note: The order of arguments is backwards from that of strcpy(). That is,
bcopy() copies from its first argument to its second argument, while strcpy()
copies from its second argument to its first argument.
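
The argument order and the overlap guarantee can be sketched together. insert_front() and same_bytes() are hypothetical helper names, and the sketch assumes bcopy() and bcmp() are declared in <strings.h>, as on current BSD-derived and Linux systems.

```c
#include <strings.h>

/*
 * insert_front() is a hypothetical helper. It shifts the first used
 * bytes of buf one place to the right and stores c in front of them.
 * Source and destination overlap, which bcopy() handles correctly.
 * buf is assumed to have room for used + 1 bytes.
 */
void insert_front(char *buf, int used, char c)
{
    bcopy(buf, buf + 1, used);      /* source first, destination second */
    buf[0] = c;
}

/*
 * same_bytes() is a hypothetical helper. It returns nonzero when the
 * two byte strings are identical over length bytes.
 */
int same_bytes(const char *a, const char *b, int length)
{
    return bcmp(a, b, length) == 0;
}
```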


bzero() — Clear Byte String to Zero

    bzero(b, length)
    char *b;
    int length;

bzero() zeroes length bytes in the string b.

ffs() — Find First Bit Set

    ffs(i)
    int i;

ffs() finds the first bit set in the argument passed it and returns the index of
that bit. Bits are numbered starting at 1 from the right. A return value of -1
indicates the value passed is zero.
Low-Level File I/O


This appendix describes the bottom level of I/O on the SunOS system. The
lowest level of I/O in SunOS provides no buffering or any other services except
moving data; it is, in fact, a direct entry into the operating system. You are
entirely on your own, but on the other hand, you have the most control over what
happens. And since the calls and usage are quite simple, this isn’t as bad as it
sounds.

A.1. File Descriptors

In SunOS, all I/O is done by reading or writing files, because all peripheral
devices, even the user’s terminal, are files in the file system. This means that a
single, homogeneous interface handles all communication between a program and
peripheral devices.

In the most general case, before reading or writing a file, it is necessary to inform 
the system of your intent to do so, a process called ‘opening’ the file. If you are 
going to write on a file, it may also be necessary to create it. The system checks 
your right to do so: does the file exist? Do you have permission to access it? If 
all is well, the system returns a small positive integer called a file descriptor. 
From then on, whenever I/O is to be done on the file, the file descriptor is used 
instead of the name to identify the file. This is roughly analogous to the use of
READ(5,...) and WRITE(6,...) in FORTRAN. All information about an
open file is maintained by the system; the user program refers to the file only by 
the file descriptor. 

File pointers are similar in spirit to file descriptors, but file descriptors are more 
fundamental. A file pointer is a pointer to a structure that contains, among other 
things, the file descriptor for the file in question. 

Since input and output involving the user’s terminal are so common, special 
arrangements exist to make this convenient. When the command interpreter (the 
‘shell’) runs a program, it opens three files, with file descriptors 0, 1, and 2, 
called standard input, standard output, and standard error output. All of these are 
normally connected to the terminal, so if a program reads file descriptor 0 and 
writes to file descriptors 1 and 2, it can do terminal I/O without opening the files. 

If I/O is redirected to and from files with < and >, as in 


    tutorial% prog < infile > outfile



the shell changes the default assignments for file descriptors 0 and 1 from the ter- 
minal to the named files. Similar observations hold if the input or output is asso- 
ciated with a pipe. Normally file descriptor 2 remains attached to the terminal, 
so error messages can go there. In all cases, the file assignments are changed by 
the shell, not by the program. The program does not need to know where its 
input comes from nor where its output goes, so long as it uses file 0 for input and 
1 and 2 for output. 

A.2. read() and write()

All input and output is done by two functions called read() and write().
The first argument for both of these functions is a file descriptor. The second
argument is a buffer in your program where the data is to come from or go to.
The third argument is the number of bytes to be transferred. The calls are

    n_read = read(fd, buf, n);
    n_written = write(fd, buf, n);


Each call returns a byte count which is the number of bytes actually transferred. 
On reading, the number of bytes returned may be less than the number asked for, 
because fewer than n bytes remained to be read in the buffer. When the file is a 
terminal, read ( ) normally reads only up to the next newline, which is generally 
less than what was requested. A return value of zero bytes implies end of file, 
and -1 indicates an error of some sort. For writing, the returned value is the 
number of bytes actually written; it is generally an error if this isn’t equal to the 
number supposed to be written. 

The number of bytes to be read or written is quite arbitrary. The two most com- 
mon values are 1, which means one character at a time (‘unbuffered’), and 1024, 
corresponding to the physical blocksize on many peripheral devices. This latter 
size will be most efficient, but even character-at-a-time I/O is not inordinately 
expensive. 

Note The file <stdio.h> defines the constant BUFSIZ, but in the following small
examples, it is more efficient to have the definition in place.

Putting these facts together, we can write a simple program to copy its input to 
its output. This program will copy anything to anything, since the input and out- 
put can be redirected to any file or device. 
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. . \ 

    #define BUFSIZ 1024

    main()      /* copy input to output */
    {
        char buf[BUFSIZ];
        int n;

        while ((n = read(0, buf, BUFSIZ)) > 0)
            write(1, buf, n);
        exit(0);
    }


If the file size is not a multiple of BUFSIZ, some read() will return a smaller
number of bytes, and the next call to read() after that will return zero.

It is instructive to see how read() and write() can be used to construct
higher-level routines like getchar(), putchar(), etc. For example, here is
a version of getchar() which does unbuffered input.


    #define CMASK 0xff  /* for making char's > 0 */
    #define EOF (-1)

    getchar()   /* unbuffered single character input */
    {
        char c;

        return ((read(0, &c, 1) > 0) ? c & CMASK : EOF);
    }


c must be declared char, because read() requires a character pointer. The
character being returned must be masked with 0xff to ensure that it is positive;
otherwise sign extension may make it negative. The constant 0xff is appropriate
for Sun workstations but not necessarily for other machines.

The second version of getchar() does input in big chunks, and hands out the
characters one at a time:
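
A buffered reader of this kind can be sketched as follows. This is an illustration in present-day C rather than the guide's own listing; the name buffered_getc, the CHUNK size and the descriptor parameter are assumptions. Because the buffer is static, the sketch serves only one descriptor at a time.

```c
#include <unistd.h>

#define CHUNK 1024      /* hypothetical buffer size */

/* Return the next byte from fd as a non-negative int,
   or -1 at end of file or on error. */
int buffered_getc(int fd)
{
    static unsigned char buf[CHUNK];
    static unsigned char *next = buf;
    static ssize_t left = 0;

    if (left <= 0) {                    /* buffer is empty: refill it */
        left = read(fd, buf, sizeof buf);
        next = buf;
        if (left <= 0) {
            left = 0;
            return -1;
        }
    }
    left--;
    return *next++;                     /* unsigned char, so never negative */
}
```

Storing the bytes as unsigned char makes the masking step of the unbuffered version unnecessary.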
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A.3. open(), close(), unlink()

Other than the default standard input, output and error files, you must explicitly
open files in order to read or write them.

open() is rather like the fopen() discussed in the previous section, except
that instead of returning a file pointer, it returns a file descriptor, which is just an
int.



In the call open(name, rwmode), the name argument is, as with fopen(), a
character string corresponding to the external file name. The access mode
argument is different, however: rwmode is 0 for read, 1 for write, and 2 for
read and write access. open() returns -1 if an error occurs; otherwise it
returns a valid file descriptor.

It is an error to try to open() a file that does not exist.
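
The -1 return can be checked as in the following sketch. It is written in present-day C and is not taken from the guide: the helper name can_open_for_reading is hypothetical, and O_RDONLY from <fcntl.h> is the symbolic name conventionally used for access mode 0.

```c
#include <fcntl.h>
#include <unistd.h>

/* Return 1 if path can be opened for reading, 0 otherwise. */
int can_open_for_reading(const char *path)
{
    int fd = open(path, O_RDONLY);  /* read-only access */

    if (fd == -1)
        return 0;                   /* e.g. the file does not exist */
    close(fd);                      /* release the descriptor again */
    return 1;
}
```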

In the SunOS file system, there are nine bits of protection information associated 
with a file, controlling read, write and execute permission for the owner of the 
file, for the owner’s group, and for all others. Thus a three-digit octal number is 
most convenient for specifying the permissions. For example, 0755 specifies 
read, write and execute permission for the owner, and read and execute permis- 
sion for the group and everyone else. For more information about permissions, 
read the manual page for chmod(1).

To illustrate, here is a simplified version of the SunOS utility cp, a program 
which copies one file to another. The main simplification is that our version 
copies only one file, and does not permit the second argument to be a directory: 
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
    #define NULL 0
    #define BUFSIZ 1024
    #define PMODE 0644  /* RW for owner, R for group & others */

    main(argc, argv)    /* cp: copy f1 to f2 */
    int argc;
    char *argv[];
    {
        int f1, f2, n;
        char buf[BUFSIZ];

        if (argc != 3)
            error("Usage: cp from to", NULL);
        if ((f1 = open(argv[1], 0)) == -1)
            error("cp: can't open %s", argv[1]);
        if ((f2 = creat(argv[2], PMODE)) == -1)
            error("cp: can't create %s", argv[2]);

        while ((n = read(f1, buf, BUFSIZ)) > 0)
            if (write(f2, buf, n) != n)
                error("cp: write error", NULL);
        exit(0);
    }

There is a limit (typically 64) on the number of files which a program may have
open simultaneously. Accordingly, any program which intends to process many
files must be prepared to reuse file descriptors. The routine close(fd) breaks
the connection between a file descriptor and an open file, and frees the file
descriptor for use with some other file. File descriptors 0, 1, and 2 can also be
closed if you need to obtain extra file descriptors. Program termination through
exit() or return from the main program closes all open files.

The function unlink(filename) removes the file filename from the file
system.

A.4. Random Access - lseek()

File I/O is normally sequential: each read() or write() takes place at a
position in the file right after the previous one. When necessary, however, the
data in a file can be read or written in any arbitrary order. The system call
lseek() provides a way to move around in a file without actually reading or
writing:
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' \ 

    lseek(fd, offset, origin);

forces the current position in the file whose descriptor is fd to move to position
offset, which is taken relative to the location specified by origin. Subsequent
reading or writing will begin at that position. offset is a long; fd and
origin are ints. origin can be 0, 1, or 2 to specify that offset is to be
measured from the beginning, from the current position, or from the end of the
file, respectively. For example, to append to a file, seek to the end before
writing:


    lseek(fd, 0L, 2);


Note that in this case, if offset were nonzero, the length of the file would be 
extended by offset. 

To get back to the beginning (‘rewind’),

    lseek(fd, 0L, 0);

Notice the 0L argument; it could also be written as (long) 0.
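
The two calls can be combined to find the length of an open file, since lseek() returns the resulting position. The sketch below is an illustration in present-day C, not part of the guide; the name file_size is hypothetical, and the numeric origins 2 and 0 are the ones described above.

```c
#include <sys/types.h>
#include <unistd.h>

/* Return the size in bytes of the open file fd and leave the
   position at the beginning; return -1 if either seek fails. */
long file_size(int fd)
{
    off_t end = lseek(fd, 0L, 2);       /* origin 2: measure from the end */

    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, 0L, 0) == (off_t)-1)  /* origin 0: rewind */
        return -1;
    return (long)end;
}
```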

With lseek(), it is possible to treat files more or less like large arrays, at the
price of slower access. For example, the following simple function reads any
number of bytes from any arbitrary place in a file.

    get(fd, pos, buf, n)    /* read n bytes from position pos */
    int fd, n;
    long pos;
    char *buf;
    {
        lseek(fd, pos, 0);  /* get to pos */
        return (read(fd, buf, n));
    }


A.5. Error Processing

The routines discussed in this section, and in fact all the routines which are direct
entries into the system, can incur errors. Usually they indicate an error by
returning a value of -1. Sometimes it is nice to know what sort of error occurred;
for this purpose all these routines, when appropriate, leave an error number in the
external variable errno. The meanings of the various error numbers are listed
in intro(2) in the Sun System Interface Manual so your program can, for example,
determine if an attempt to open a file failed because it did not exist or
because the user lacked permission to read it. Perhaps more commonly, you may
want to display the reason for failure. The routine perror() displays a message
associated with the value of errno; more generally, sys_errlist is an array of
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character strings which can be indexed by errno and displayed by your pro- 
gram. 
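
As an illustration of this pattern, the sketch below reports why an open() failed. It is written in present-day C and is not from the guide: the name report_open_failure is hypothetical, and it uses the standard strerror() function in place of indexing the error-string array directly.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Try to open path for reading. On failure, print the reason on
   stderr and return the errno value; on success return 0. */
int report_open_failure(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        int saved = errno;      /* save it before another call changes it */
        fprintf(stderr, "%s: %s\n", path, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

Calling perror(path) in the failure branch would print a similar message in one step.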
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Binary I/O

The binary I/O facilities of the C library provide for record-oriented sequential
access to files.

WARNING Using these routines may result in data incompatibilities when porting
programs to or from some other machines. See the description of Sun’s External
Data Representation (XDR) standard for creating portable code as described in
Network Programming.

fread() - Read Data from File

The fread() function reads some number of objects into a block, from a
specified file. The interface to fread() is:

    fread(pointer, sizeof *pointer, items, stream)
    char *pointer;
    int items;
    FILE *stream;

The arguments to fread() have the following meanings:

pointer  is a pointer to a block of objects

items    is a count of the number of objects of a data type determined by the
         type of whatever pointer points to

stream   is the named input stream

The value of the fread() function is the number of objects actually read.

fwrite() - Write Data to File

The fwrite() function writes some number of objects from a block, onto a
specified file. The interface to fwrite() is:

    fwrite(pointer, sizeof *pointer, items, stream)
    char *pointer;
    int items;
    FILE *stream;
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The arguments to fwrite() have the following meanings:

pointer  is a pointer to a block of objects

items    is a count of the number of objects of a data type determined by the
         type of whatever pointer points to

stream   is the named output stream

The value of the fwrite() function is the number of objects actually written
to the named stream.
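
The two routines are typically used as a pair on the same record type. The following sketch, in present-day C, writes an array of records and reads it back; the struct point type and the name round_trip are hypothetical and serve only as an illustration.

```c
#include <stdio.h>

struct point { int x, y; };     /* hypothetical record type */

/* Write count records to stream, rewind, and read them back into out.
   Returns the number of records read back, or -1 on a short write. */
long round_trip(FILE *stream, const struct point *in,
                struct point *out, size_t count)
{
    if (fwrite(in, sizeof *in, count, stream) != count)
        return -1;
    rewind(stream);             /* go back to the first record */
    return (long)fread(out, sizeof *out, count, stream);
}
```

The records read back are only guaranteed to match on the same kind of machine, which is the portability concern raised in the warning above.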
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Memory Management 


These routines provide a general-purpose memory allocation package. They 
maintain a table of free blocks for efficient allocation and coalescing of free 
storage. When there is no suitable space already free, the allocation routines call 
sbrk (see brk(2)) to get more memory from the system. 

Each of the allocation routines returns a pointer to space suitably aligned for 
storage of any type of object. They return a null pointer if the request cannot be 
completed. 


C.1. malloc() - Allocate Memory

    char *malloc(num)
    unsigned num;

malloc() allocates num bytes. The pointer returned is aligned so as to be usable
for any purpose. NULL is returned if no space is available. The result of
malloc(0) is undefined.

C.2. free() - Free Allocated Memory

    int free(ptr)
    char *ptr;

free() frees up memory previously allocated by malloc(). Disorder can be
expected if the pointer was not obtained from malloc().

C.3. calloc() - Allocate Memory for C Objects

    char *calloc(num, size)
    unsigned num;
    unsigned size;

calloc() allocates space for num items, each of size size. The space is
guaranteed to be set to 0 and the pointer is aligned so as to be usable for any
purpose. NULL is returned if no space is available.
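
The allocation routines are normally used with a check for the null return. The sketch below is an illustration in present-day C using the <stdlib.h> prototypes; the names copy_string and make_counters are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of s, or NULL if no space is available.
   The caller releases the copy with free(). */
char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;     /* include the terminating '\0' */
    char *p = malloc(len);

    if (p == NULL)
        return NULL;
    memcpy(p, s, len);
    return p;
}

/* Return n counters, all initially zero, or NULL on failure. */
long *make_counters(size_t n)
{
    return calloc(n, sizeof(long)); /* calloc() zero-fills the block */
}
```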
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C.4. cfree() - Free Allocated Memory

    (void) cfree(ptr, num, size)
    char *ptr;
    unsigned num;
    unsigned size;

Space is returned to the pool used by calloc(). Disorder can be expected if
the pointer was not obtained from calloc().

C.5. realloc() - Change Size of Allocated Block

    char *realloc(ptr, size)
    char *ptr;
    unsigned size;

realloc() changes the size of the block referenced by ptr to size bytes and
returns a pointer to the (possibly moved) block. The contents will be unchanged
up to the lesser of the new and old sizes. For backwards compatibility,
realloc() accepts a pointer to a block freed since the most recent call to
malloc(), calloc(), realloc(), valloc(), or memalign(). Note that
using realloc() with a block freed before the most recent call to malloc(),
calloc(), realloc(), valloc(), or memalign() is an error.
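
Because realloc() may move the block, callers usually keep the old pointer until the call is known to have succeeded. The following sketch in present-day C shows that idiom; the name grow_ints is hypothetical.

```c
#include <stdlib.h>

/* Resize *buf to hold new_count ints. On success update *buf and
   return 0; on failure leave *buf untouched and return -1. */
int grow_ints(int **buf, size_t new_count)
{
    int *bigger = realloc(*buf, new_count * sizeof **buf);

    if (bigger == NULL)
        return -1;      /* the old block is still valid */
    *buf = bigger;      /* the block may have moved */
    return 0;
}
```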


C.6. memalign() - Allocate to Alignment Boundary

    char *memalign(alignment, size)
    unsigned alignment;
    unsigned size;

memalign() allocates size bytes on a specified alignment boundary, and
returns a pointer to the allocated block. The value of the returned address is
guaranteed to be an even multiple of alignment bytes. Note that the value of
alignment must be a power of two, and must be greater than or equal to the size
of a word.

realloc(), valloc(), and memalign() return NULL and set errno if
arguments are invalid, or if there is insufficient available memory, or if the heap
has been detectably corrupted, for example, by storing outside the bounds of a
block.


C.7. valloc() - Allocate Memory on a Page Boundary

    char *valloc(size)
    unsigned size;

valloc(size) is equivalent to memalign(getpagesize(), size).
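
On present-day POSIX systems the same page-aligned request is usually written with posix_memalign(), which is a substitute for the routines described here rather than part of the Sun library. The sketch below makes that assumption; the name page_aligned_alloc is hypothetical, and sysconf(_SC_PAGESIZE) stands in for getpagesize().

```c
#include <stdlib.h>
#include <unistd.h>

/* Allocate size bytes starting on a page boundary, or return NULL.
   The block is released with free(). */
void *page_aligned_alloc(size_t size)
{
    void *block = NULL;
    long page = sysconf(_SC_PAGESIZE);  /* system page size in bytes */

    if (page <= 0 || posix_memalign(&block, (size_t)page, size) != 0)
        return NULL;
    return block;
}
```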






C.8. alloca() - Allocate Memory on Stack

    char *alloca(size)
    int size;

alloca() allocates size bytes of space in the stack frame of the caller, and
returns a pointer to the allocated block. This temporary space is automatically
freed when the caller returns.

Warning alloca() is both machine- and compiler-dependent; its use is strongly
discouraged. It is possible to request more stack space than is available, but if
you do, there is no way to detect this condition.


C.9. Memory Allocation Debugging

More detailed diagnostics can be made available to programs using the memory
management routines described in this chapter by including a special relocatable
object file at link time. This file also provides routines for control of error
handling and diagnosis, as defined below. Note that these routines are not defined
in the standard library.


malloc_debug() - Set Debug Level

    int malloc_debug(level)
    int level;

malloc_debug() sets the level of error diagnosis and reporting during
subsequent calls to malloc(), calloc(), realloc(), valloc(),
memalign(), cfree(), and free(). The value of level is interpreted as
follows:

0   malloc(), calloc(), realloc(), valloc(), memalign(),
    cfree(), and free() behave the same as in the standard library.

1   malloc(), calloc(), realloc(), valloc(), memalign(),
    cfree(), and free() abort with a message to stderr if errors are
    detected in arguments or in the heap. If a bad block is encountered,
    its address and size are included in the message.

2   Same as level 1, except that the entire heap is examined on every call
    to malloc(), calloc(), realloc(), valloc(),
    memalign(), cfree(), and free().

malloc_debug() returns the previous error diagnostic level. The default
level is 1.


malloc_verify() - Check Storage Allocation Heap

    int malloc_verify()

malloc_verify() attempts to determine if the heap has been corrupted. It
scans all blocks in the heap (both free and allocated) looking for strange
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addresses or absurd sizes, and also checks for inconsistencies in the free space 
table. malloc_verify() returns 1 if all checks pass without error, and otherwise
returns 0. The checks can take a significant amount of time, so it should not
be used indiscriminately. 

The file /usr/lib/debug/malloc.o contains the diagnostic versions of
malloc(), free(), etc.

C.10. Errors from Memory Management Routines

malloc(), calloc(), realloc(), valloc(), memalign(),
cfree(), and free() set errno as follows:

EINVAL  An invalid argument was given. The value of ptr given to free(),
        cfree(), or realloc() must be a pointer to a block previously
        allocated by malloc(), calloc(), realloc(), valloc(),
        or memalign(). EINVAL is also set if the heap is found to have
        been corrupted. More detailed information may be obtained by
        enabling range checks using malloc_debug().

ENOMEM  size bytes of memory could not be allocated.
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Sun C Data Representations 


This appendix describes how Sun C represents data in storage and the mechan- 
isms for passing arguments to functions. This chapter is intended as a guide to 
programmers who wish to write or use modules in languages other than C and 
have those modules interface to C code. 

D.1. Storage Allocation

This section describes how storage is allocated to variables of various types.

In general, any word value is always aligned on a two-byte boundary. Values
that can fit into a single byte are aligned on a byte boundary.


Table D-1 Storage Allocation for Data Types

Data Type        Internal Representation

char elements    A single 8-bit byte.

short integers   One word (two bytes or 16 bits), aligned on a two-byte
                 boundary.

int and long     32 bits (four bytes or two words), aligned on a two-byte
                 boundary.

float            32 bits (four bytes or two words), aligned on a two-byte
                 boundary. A float has a sign bit, 8-bit exponent and 23-bit
                 fraction. On a Sun-4, they are aligned on 4-byte boundaries.

double           64 bits (eight bytes or four words), aligned on a word
                 boundary. A double element has a sign bit, an 11-bit exponent
                 and a 52-bit fraction. On a Sun-4, they are aligned on 8-byte
                 boundaries.


D.2. Data Representations

Bit numberings of any given data element depend on the architecture in use:
Sun-3s, Sun-4s, and SPARCstations use bit 0 as the least significant bit, with
byte 0 being the most significant byte.
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Integer Representations

There are three integer types used in Sun C: short, int, and long.

Table D-2 Representation of short

Bits    Content
8-15    Byte 0
0-7     Byte 1

Table D-3 Representation of int and long

Bits    Content
24-31   Byte 0
16-23   Byte 1
8-15    Byte 2
0-7     Byte 3


float and double Representation

float and double data elements are represented according to the ANSI IEEE
754-1985 standard. In the tables below,

    s = sign (1 bit)
    e = biased exponent (8 bits for float, 11 bits for double)
    f = fraction (23 bits for float, 52 bits for double)
    u = unsigned


Table D-4 float Representation

Bits    Name      Content
31      Sign      1 iff number is negative.
23-30   Exponent  Eight-bit exponent, biased by 127. Values of all zeros, and
                  all ones, reserved.
0-22    Fraction  23-bit fraction component of normalized significand. The
                  "one" bit is "hidden".
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Table D-5 double Representation

Bits    Name      Content
63      Sign      1 iff number is negative.
52-62   Exponent  Eleven-bit exponent, biased by 1023. Values of all zeros,
                  and all ones, reserved.
0-51    Fraction  52-bit fraction component of normalized significand. The
                  "one" bit is "hidden".


A float or double number is represented by the form:

    $(-1)^{sign} \times 2^{exponent - bias} \times 1.f$

where “1.f” is the significand and “f” is the bits in the significand fraction.
The bias is 127 for a float and 1023 for a double.

Normalized float and double numbers are said to contain a "hidden" bit,
providing for one more bit of precision than would otherwise be the case.

Extreme Number Representation


Table D-6 float Representations

normalized number (0<e<255):    $(-1)^{sign} \times 2^{exponent - 127} \times 1.f$
subnormal number (e=0, f!=0):   $(-1)^{sign} \times 2^{-126} \times 0.f$
zero (e=0, f=0):                $(-1)^{sign} \times 0.0$
signaling NaN:                  s=u, e=255(max); f=.0uuu-uu (at least one bit must be nonzero)
quiet NaN:                      s=u, e=255(max); f=.1uuu-uu
Infinity:                       s=u, e=255(max); f=.0000-00 (all zeroes)


Table D-7 double Representations

normalized number (0<e<2047):   $(-1)^{sign} \times 2^{exponent - 1023} \times 1.f$
subnormal number (e=0, f!=0):   $(-1)^{sign} \times 2^{-1022} \times 0.f$
zero (e=0, f=0):                $(-1)^{sign} \times 0.0$
signaling NaN:                  s=u, e=2047(max); f=.0uuu-uu (at least one bit must be nonzero)
quiet NaN:                      s=u, e=2047(max); f=.1uuu-uu
Infinity:                       s=u, e=2047(max); f=.0000-00 (all zeroes)
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Hexadecimal Representation of Selected Numbers

Value       float      double
+0          00000000   0000000000000000
-0          80000000   8000000000000000
+1.0        3F800000   3FF0000000000000
-1.0        BF800000   BFF0000000000000
+2.0        40000000   4000000000000000
+3.0        40400000   4008000000000000
+Infinity   7F800000   7FF0000000000000
-Infinity   FF800000   FFF0000000000000
NaN         7F8xxxxx   7FFxxxxxxxxxxxxx
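
The float column can be checked on any machine that uses IEEE 754 single precision. The sketch below, in present-day C, copies the bytes of a float into a 32-bit integer; the name float_bits is hypothetical, and the code assumes that float is 32 bits wide.

```c
#include <stdint.h>
#include <string.h>

/* Return the 32-bit pattern of an IEEE 754 single-precision value.
   Assumes sizeof(float) == 4. */
uint32_t float_bits(float value)
{
    uint32_t bits;

    memcpy(&bits, &value, sizeof bits); /* copy bytes, no numeric conversion */
    return bits;
}
```

For example, float_bits(1.0f) gives 0x3F800000: sign 0, biased exponent 127, fraction 0.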


Pointer Representation

A pointer in C occupies four bytes. The NULL value pointer is equal to zero.

Array Storage

Arrays are stored with their elements in a specific storage order. The elements
are actually stored in a linear sequence of storage elements.

C arrays are stored in row-major order; the last subscript in a multi-dimensional
array varies fastest.

String data types are simply arrays of char elements.

Arithmetic Operations on Extreme Values

This subsection describes the results derived from applying the basic arithmetic
operations to combinations of extreme and ordinary floating-point values.

No traps or any other exception actions are taken.

All inputs are assumed to be positive. Overflow, underflow, and cancellation are
assumed not to happen. In all the tables below, the abbreviations have the
following meanings:

Table D-8 Extreme Values Usage

Abbreviation    Meaning
Num             Subnormal or Normalized Number
Inf             Infinity (positive or negative)
NaN             Not a Number
Uno             Unordered

The tables that follow describe the types of values that result from arithmetic
operations performed with combinations of different types of operands.
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Table D-9 Addition and Subtraction Results

Addition and Subtraction

                        Right Operand
Left Operand    0       Num     Inf     NaN
0               0       Num     Inf     NaN
Num             Num     Num     Inf     NaN
Inf             Inf     Inf     Note    NaN
NaN             NaN     NaN     NaN     NaN

Note: Inf + Inf = Inf; Inf - Inf = NaN


Table D-10 Multiplication Results

Multiplication

                        Right Operand
Left Operand    0       Num     Inf     NaN
0               0       0       NaN     NaN
Num             0       Num     Inf     NaN
Inf             NaN     Inf     Inf     NaN
NaN             NaN     NaN     NaN     NaN

Table D-11 Division Results

Division

                        Right Operand
Left Operand    0       Num     Inf     NaN
0               NaN     0       0       NaN
Num             Inf     Num     0       NaN
Inf             Inf     Inf     NaN     NaN
NaN             NaN     NaN     NaN     NaN
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Table D-12 Comparison Results

Comparison

                        Right Operand
Left Operand    0       Num     Inf     NaN
0               =       <       <       Uno
Num             >               <       Uno
Inf             >       >               Uno
NaN             Uno     Uno     Uno     Uno

Note: NaN compared with NaN is Unordered, and also results in inequality.

+0 compares equal to -0.
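
Several entries of these tables can be reproduced on a machine with IEEE 754 arithmetic. The sketch below, in present-day C, labels a value with the abbreviations of Table D-8; the name classify is hypothetical, and isnan() and isinf() come from the C99 <math.h>.

```c
#include <math.h>

/* Label x with the abbreviations of Table D-8. */
const char *classify(double x)
{
    if (isnan(x))
        return "NaN";
    if (isinf(x))
        return "Inf";
    if (x == 0.0)
        return "0";     /* +0 and -0 both compare equal to 0.0 */
    return "Num";
}
```

With inf holding an infinity, classify(inf - inf) yields "NaN" and classify(1.0 / inf) yields "0", matching Tables D-9 and D-11.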


D.3. Argument Passing Mechanism


This section describes how arguments are passed in Sun C. 

All arguments to C functions are passed by value. 

Actual arguments are pushed onto the stack in the reverse order from which they 
are declared in a function declaration. 


Actual arguments which are expressions are evaluated before the function refer- 
ence. The result of the expression is then pushed onto the stack. 

Sun-3   On Sun-3s, functions return their results in register D0, or in
        registers D0 and D1 when the result is a double value.

Sun-4   On Sun-4s, functions return integer and float results in register %o0,
        while double results are returned in %f0 and %f1.


All arguments, except doubles, are passed as four-byte values; a double is 
passed as an eight-byte value. All float values are passed as doubles. 

Upon return from a function, it is the responsibility of the caller to pop argu- 
ments from the stack. 
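
The rule that float values are passed as doubles survives in ISO C for the variable part of an argument list, which gives a portable way to observe it. The following sketch assumes a present-day compiler with <stdarg.h>; the name sum_doubles is hypothetical.

```c
#include <stdarg.h>

/* Add up count floating-point arguments. Each one arrives as a
   double, even when the caller wrote a float expression. */
double sum_doubles(int count, ...)
{
    va_list ap;
    double total = 0.0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, double);    /* float was widened to double */
    va_end(ap);
    return total;
}
```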


D.4. Referencing Data Objects in C


This section describes how variables of different types are actually accessed (or 
referenced). The method and notations of access, of course, differ depending on 
whether the object is a simple variable, an array, a structure, or a union. 


Referencing Simple Variables 


A plain variable (of simple scalar type) is accessed by its identifier. Since such a 
simple variable has no structure, its identifier alone is enough to reference it. 
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Figure D-1 Examples of Simple Variable References



Referencing With Pointers

A variable can also be declared as a pointer to another object. In this case, the
reference to the object must be done with the pointer notation. Placing an
asterisk character * in front of an identifier uses that identifier as a pointer to an
object, and the thing that is read from or written to is the object that the identifier
points to.

Figure D-2 Examples of Pointer References 
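
As an illustration of reading and writing through pointers (a sketch in present-day C, not the contents of the original figure), the hypothetical function swap_ints exchanges the two objects its arguments point to:

```c
/* Exchange the two ints that a and b point to. */
void swap_ints(int *a, int *b)
{
    int saved = *a;     /* read the object a points to */

    *a = *b;            /* write through a */
    *b = saved;         /* write through b */
}
```

A caller passes the addresses of two plain variables, as in swap_ints(&x, &y).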



Referencing Array Elements

When an identifier of an array type appears in an expression, the identifier is
converted to a pointer to the first member of the array.

The subscript operation [ ] is interpreted such that

    E1[E2]
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is equivalent to the construct 


    *((E1) + (E2))


Figure D-3 Examples of Array Variable References

    /* Declare some array variables */
    int egress[10];
    float lightly[5][5];
    char coal[100];
    extern double sin();
    int idx;
    int idy;

    /* Now reference those variables */
    for (idx = 0; idx < 10; idx++)
        egress[idx] = 10;       /* Set int to a constant */

    for (idx = 0; idx < 5; idx++)
        for (idy = 0; idy < 5; idy++)
            printf("%f ", sin(lightly[idx][idy]));

    for (idx = 0; idx < 100; idx++)
        putchar(coal[idx]);     /* Write to standard output */
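
Both the subscript equivalence and the row-major layout can be checked directly. The sketch below is in present-day C; the names element_offset and same_element and the 5 by 5 dimensions are illustrative assumptions.

```c
#include <stddef.h>

#define ROWS 5
#define COLS 5

/* Offset, in elements, of grid[i][j] from the start of the array. */
size_t element_offset(float grid[ROWS][COLS], int i, int j)
{
    char *base = (char *)grid;
    char *elem = (char *)&grid[i][j];

    return (size_t)(elem - base) / sizeof(float);
}

/* E1[E2] and *((E1) + (E2)) name the same object. */
int same_element(int *array, int index)
{
    return &array[index] == &*(array + index);
}
```

In row-major order the offset of grid[i][j] is i * COLS + j, so the last subscript varies fastest.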


Referencing Structures and Unions

There are only three operations which may be done on a structure or a union:

1. A member of the structure or union can be referenced by means of the 
. or -> operator. 

2. The address of the entire structure or union can be taken, with the & 
operator. 

3. One structure can be copied to another of the same type with the 
assignment operator. 

The . operator is used in contexts where the structure or union identifier is avail- 
able directly to the expression. The -> operator is used when the identifier for 
the structure or union is a pointer to the object. Structures can also be passed as 
parameters, returned from functions, or assigned to variables of the same struc- 
ture or union type. 
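
The three operations, together with passing and returning structures by value, fit in a few lines. The following is a sketch in present-day C; the struct point type and the names shift and shifted are hypothetical.

```c
struct point { int x; int y; };     /* hypothetical structure type */

/* Move the point that p refers to; members are reached with ->. */
void shift(struct point *p, int dx, int dy)
{
    p->x += dx;
    p->y += dy;
}

/* Return a shifted copy; the caller's structure is unchanged
   because it was passed by value. */
struct point shifted(struct point original, int dx, int dy)
{
    struct point copy = original;   /* assignment copies every member */

    shift(&copy, dx, dy);           /* & takes the structure's address */
    return copy;
}
```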
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Figure D-4 Examples of Accessing Members of Structures

    #define MAXLEN 256
    #define NULL 0

    demo(wanted)
    char *wanted;
    {
        int i;      /* loop index */

        /* Declare a couple of structures */
        struct {    /* This one is fairly simple */
            int level;
            char *cp;
            char pbuffer[MAXLEN];
        } putter;

        struct vallist {    /* This one is a linked list */
            char *name;
            char valtype;
            int value;
            struct vallist *nextval;
        } *valhead, *valtail;

        struct vallist *pointer;

        /* Now access the members */
        putter.level = 10;
        for (i = 0; i < MAXLEN; i++)
            putter.pbuffer[i] = *putter.cp;

        /* Access members through pointers */
        for (pointer = valhead;
             pointer != NULL;
             pointer = pointer->nextval)
            if (strcmp(pointer->name, wanted) == 0)
                return (pointer);

    }   /* End of the demo function */
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E 


Sun C Extensions 


The language described by Kernighan and Ritchie in The C Programming
Language (referred to hereafter as “K&R C”), while close to Sun C, is not identi- 
cal to it. The extensions to K&R C embodied in Sun C are described below, with 
the relevant section of Appendix A of The C Programming Language listed for 
each topic discussed. 


E.1. Keywords (§A.2.3)

Sun C includes the additional keywords void and enum.

In Sun C, functions may be declared to return the type void. This means that
the function doesn’t return any value, and so is functionally a subroutine. There
are no objects of type void.

E.2. Name Spaces (§A.4)

Sun C provides separate name spaces for

□ struct/union and enum tags

□ Elements of each different type of struct/union

□ Everything else: regular variables and functions

K&R C provides two name spaces: one for struct/union tags, and the other
for all variables, functions, typedef’d names, and so on.

E.3. Characters and Integers (§A.6.1)

Sun C’s characters are signed, and all ASCII characters are positive. Unsigned
characters are, of course, unsigned, and promote to unsigned. See also the
reference to §A.8.2 below.


E.4. float and double (§A.6.2)

In K&R C, whenever a float appears in an expression it is lengthened to
double by zero-padding its fraction.

In Sun C, floats are lengthened to doubles in expressions, but with considerably
more work, since the exponent part is of a different width, and of a different
bias. (See Appendix D for further discussion.)

Sun C also provides a compiler option, -fsingle, to avoid this widening in
expressions using only floats. -fsingle will not prevent float formal
parameters from being rewritten as doubles, nor float-valued actual parameters
from being promoted to double.
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E.5. Arithmetic Conversions (§A.6.6)

Unsigned char and unsigned short promote to unsigned. Since in Sun C
long == int, nothing ever promotes to long.

E.6. Primary Expressions (§A.7.1)

Sun C supports passing structs and unions by value. The C Programming
Language does not discuss the possibility of passing structs or unions as
value parameters since it is not allowed in K&R C. See §A.10.1 below.

E.7. Multiplicative Operators (§A.7.3)

The C Programming Language states that % may not be applied to operands of
type float. In Sun C, it may not be applied to operands of type double,
either. Note that the sign of the remainder is the same as the sign of the
dividend.

E.8. Storage Class Specifiers (§A.8.1)

In Sun C, any integral type (combinations of char, short, int, long,
unsigned, and enum) and any pointer type may be assigned to registers.
Depending on the hardware present, floats and doubles may be, too.

In K&R C, only int, char, and pointer types may be assigned to registers
with the register storage class.

E.9. Type Specifiers (§A.8.2)

Sun C supports the scalar types char, unsigned char, int, short int,
unsigned short int, long int, enum, float, and double.

K&R C does not support the unsigned char, unsigned short int, or
enum types. These types in Sun C promote to unsigned int rather than
int.

E.10. Declarator Naming (§A.8.4 and §A.14.1)

Sun C permits declaring a function returning a struct or union.

E.11. struct and union Declarations (§A.8.5 and §A.14.1)

Sun C permits you to both assign structs/unions and pass them as parameters.

In Sun C, fields are packed left-to-right within a storage unit appropriate to the
type they are declared to be. They may be declared as any of the integer types,
and enum. No matter what their declaration, all fields are unsigned, and thus
zero-extended for the purposes of "the normal conversions".

In Sun C, interpretation of . and -> takes into account the type of the
struct/union or pointer expression on the left to determine the name on the
right. This permits apparent clashes between offsets and types between members
of different aggregates having the same name. The only difficulty comes if the
type of the left-hand expression does not properly disambiguate the name, in
which case:

1: If there is no ambiguity, then the only choice is taken and a warning is
   issued.

2: If there is ambiguity, the program is considered to be in error.
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E.12. Switch Statement (§A.9.7)

Sun C accepts switch expressions of types float, double (fixed to ints), and
enum, as well as the integer types permitted by K&R C.


E.13. External Function Definitions (§A.10.1)


Sun C permits passing struct and union value parameters in external func- 
tions. 


E.14. Lexical Scope (§A.11.1)

Sun C does not "push down" an outer variable declaration in a compound statement
if a variable of class extern is re-declared in an inner block. In this case,
the inner declaration persists until the end of the file, and if it redeclares a name
with a definition in an outer block, it will elicit a complaint from the compiler
about redeclaring a variable.

E.15. Scope of Externals (§A.11.2)

Sun C’s linking rules are somewhat more liberal than those implied by K&R C:

□ C uninitialized global data are treated like FORTRAN uninitialized
  COMMON (a tentative definition). Sun C initialized data are
  treated like FORTRAN COMMON initialized by BLOCK DATA (a
  true definition).

□ A tentative definition in a library module will not cause the module to
  be loaded. A true definition will, if the name occurs as a reference
  or tentative declaration in a module that is already being linked. (The
  "already" here is important since order matters.)

□ If the linker sees any true definitions of a name among the modules to
  be linked, this definition overrides all tentative definitions. This
  includes the case where the true definition allocates less space for the
  named object than the tentative definition(s) would.

□ If the linker sees no true definitions of a name, the name is defined by
  the linker, and space is allocated. The amount of space allocated
  should be the maximum of the size specified in any of the tentative
  definitions in the modules being linked.

E.16. Explicit Pointer Conversions (§A.14.4)

On Sun workstations, a pointer corresponds to a 32-bit integer, while addresses
are measured in 8-bit bytes. Alignment of data depends on the particular
platform.

For more about data representation, see Appendix D.


E.17. Constant Expressions (§A.15)

Sun C permits cast operators as part of constant expressions, except in
preprocessor constant expressions (see §12.3), where the sizeof operator is
also disallowed.


E.18. Anachronisms (§A.17)

Sun C does not recognize any of the anachronisms listed in §A.17 of The C
Programming Language.
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